Abstract. The problem of monotonicity testing over the hypergrid and its special case, the
hypercube, is a classic, well-studied, yet unsolved question in property testing. We are given query
access to $f : [k]^n \mapsto R$ (for some ordered range $R$). The hypergrid/cube has a natural partial order
given by coordinate-wise ordering, denoted by $\prec$. A function is monotone if for all pairs $x \prec y$,
$f(x) \le f(y)$. The distance to monotonicity, $\varepsilon_f$, is the minimum fraction of values of $f$ that need to
be changed to make $f$ monotone.

For $k = 2$ (the boolean hypercube), the usual tester is the edge tester, which checks monotonicity
on adjacent pairs of domain points. It is known that the edge tester using $O(\varepsilon^{-1} n \log |R|)$
samples can distinguish a monotone function from one where $\varepsilon_f > \varepsilon$. On the other hand, the
best lower bound for monotonicity testing over the hypercube is $\min(|R|^2, n)$. This leaves a
quadratic gap in our knowledge, since $|R|$ can be $2^n$. We resolve this long-standing open problem
and prove that $O(n/\varepsilon)$ samples suffice for the edge tester. For hypergrids, known testers require
$O(\varepsilon^{-1} n \log k \log |R|)$ samples, while the best known (non-adaptive) lower bound is $\Omega(\varepsilon^{-1} n \log k)$.
We give a (non-adaptive) monotonicity tester for hypergrids running in $O(\varepsilon^{-1} n \log k)$ time.

Our techniques lead to optimal property testers (with the same running time) for the natural
Lipschitz property on hypercubes and hypergrids. (A $c$-Lipschitz function is one where
$|f(x) - f(y)| \le c\|x - y\|_1$.) In fact, we give a general unified proof for $O(\varepsilon^{-1} n \log k)$-query testers for a
class of "bounded-derivative" properties, a class containing both monotonicity and Lipschitz.

1. Introduction 

Given query access to a function $f : D \mapsto R$, what can we learn about the properties of $f$
without reading all of $f$? The field of property testing [RS96, GGR98] formalizes this question
by dealing with relaxed decision problems. A property $\mathcal{P}$ is a subset of all functions; we say
that a function $f$ has property $\mathcal{P}$ if $f \in \mathcal{P}$. The distance between $f$ and $\mathcal{P}$, denoted by $\varepsilon_{f,\mathcal{P}}$, is
the minimum fraction of places at which $f$ must be changed to have the property $\mathcal{P}$. Formally,
$\varepsilon_{f,\mathcal{P}} = \min_{g \in \mathcal{P}} \left( |\{x \mid f(x) \ne g(x)\}| / |D| \right)$.

Given a parameter $\varepsilon \in (0,1)$, the classic property testing question is to design a randomized
algorithm for the following problem. If $\varepsilon_{f,\mathcal{P}} = 0$ (meaning $f$ has the property), the algorithm
must accept with probability $\ge 2/3$, and if $\varepsilon_{f,\mathcal{P}} \ge \varepsilon$, it must reject with probability $\ge 2/3$. If
$\varepsilon_{f,\mathcal{P}} \in (0, \varepsilon)$, then any answer is allowed. Such an algorithm is called a property tester for $\mathcal{P}$. The
quality of a tester is determined by the number of queries it makes, and the running time of the
tester. A one-sided tester never errs if the function satisfies the property. A non-adaptive tester
decides all of its queries in advance. In other words, the queries are independent of the answers it
receives.

A classic property studied in this framework is monotonicity. Typically, one assumes a total
order on the range $R$ (so $R$ may be assumed to be a subset of the reals), and a partial order $\prec$ on
the domain $D$. A function $f$ is monotone if $f(x) \le f(y)$ whenever $x \prec y$.
A special case of the domain arising in many applications is that of $n$-dimensional hypergrids;
that is, $D = [k]^n$.^1 Of particular interest is the $n$-dimensional hypercube where $k = 2$, which is often
denoted as $D = \{0, 1\}^n$. The hypergrid/hypercube defines the natural coordinate-wise partial
order: $x \prec y$ iff $\forall i \in [n]$, $x_i \le y_i$.

Monotonicity has been studied extensively in the past decade [EKK+00, GGL+00, DGL+99,
LR01, FLN+02, AC06, Fis04, HK08, PRR06, ACCL06, BRW05, BGJ+09, BCGSM12, BBM12].
For the hypercube domain, the tester of choice has been the edge tester. Let $H$ be the set of pairs
corresponding to the edges of the hypercube, that is, pairs that differ in precisely one coordinate.
The edge tester picks a pair in $H$ uniformly at random and checks for monotonicity of this pair.
When the range is boolean, a classic result of Goldreich et al. [GGL+00] is that $O(n/\varepsilon)$ samples
suffice to give a bona fide monotonicity tester. Dodis et al. [DGL+99] generalize the above result
to show that $O(\varepsilon^{-1} n \log |R|)$ samples suffice for a general range $R$. In the worst case, $|R| = 2^n$,
and so this gives an $O(n^2/\varepsilon)$-query tester. Briet et al. [BCGSM12] give an $\Omega(n/\varepsilon)$ lower bound for
non-adaptive, one-sided testers, and in a recent breakthrough, Blais, Brody, and Matulef [BBM12]
prove that $\Omega(\min(n, |R|^2))$ samples are required by any tester. It has been an outstanding open
problem in property testing to give an optimal bound for monotonicity testing over the hypercube.
We resolve this by showing that the edge tester is indeed optimal (when $|R| \ge \sqrt{n}$).

Theorem 1. The edge tester is an $O(n/\varepsilon)$-query (non-adaptive, one-sided) monotonicity tester for
functions $f : \{0, 1\}^n \mapsto R$.
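
As an illustration of the edge tester described above, here is a minimal Python sketch. The function name `edge_tester`, its parameters, and the representation of points as 0/1 tuples are assumptions made for this example; the number of samples is left to the caller (Theorem 1 says $O(n/\varepsilon)$ suffice).

```python
import random

def edge_tester(f, n, num_samples, rng=random):
    """Sample hypercube edges uniformly; reject (return False) on a violated edge.

    f           -- callable taking a 0/1 tuple of length n
    num_samples -- number of edges to sample (hypothetical parameter)
    """
    for _ in range(num_samples):
        # A uniform edge: pick the coordinate to flip and the other n-1 bits uniformly.
        i = rng.randrange(n)
        x = [rng.randint(0, 1) for _ in range(n)]
        lo = tuple(x[:i] + [0] + x[i + 1:])
        hi = tuple(x[:i] + [1] + x[i + 1:])
        if f(lo) > f(hi):      # lo precedes hi, so this edge violates monotonicity
            return False
    return True                # no violation seen: accept (one-sided error)

# A monotone function is never rejected; an anti-monotone one fails on every edge.
accepts_monotone = edge_tester(lambda x: sum(x), 4, 50)
accepts_antimonotone = edge_tester(lambda x: -sum(x), 4, 1)
```

The tester is non-adaptive (the sampled edges do not depend on the answers) and one-sided (a monotone function has no violated edge to find).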

When the domain is the hypergrid $[k]^n$, Dodis et al. [DGL+99] give an $O(n \log k \log |R|/\varepsilon)$-query
monotonicity tester. Since $|R|$ can be as large as $k^n$, this gives an $O(\varepsilon^{-1} n^2 \log^2 k)$-query tester. In
a recent, unpublished result, Blais et al. [BJRY12] prove a lower bound of $\Omega(n \log k)$ queries for
non-adaptive monotonicity testers (for sufficiently large $R$).

In this paper, we give a clean $O(\varepsilon^{-1} n \log k)$-query monotonicity tester on hypergrids that
generalizes the edge tester. This tester is also a uniform pair tester, in the sense that it defines a set $H$ of
pairs, picks a pair uniformly at random from it, and checks for monotonicity among this pair. The
pairs in $H$ also differ in exactly one coordinate, as in the edge tester. Furthermore, this difference
is fixed to be a power of 2. Observe that this reduces to the edge tester when $k = 2$.

Theorem 2. There exists a non-adaptive, one-sided $O(\varepsilon^{-1} n \log k)$-query monotonicity tester for
functions $f : [k]^n \mapsto R$.

We discuss some other previous work on monotonicity testers for hypergrids. For the total
order (the case $n = 1$), which has been called the monotonicity testing problem on the line,
Ergün et al. [EKK+00] give an $O(\varepsilon^{-1} \log k)$-query tester, and this is optimal [EKK+00, Fis04]. The
elegant concept of 2-TC spanners introduced by Bhattacharyya et al. [BGJ+09] gives a
general class of monotonicity testers (for richer posets than just hypergrids), although it is known
that such constructions give testers with polynomial dependence on $n$ for the hypergrid [BGJ+12].
For constant $n$, Halevy and Kushilevitz [HK08] give an $O(\varepsilon^{-1} \log k)$-query tester (although the
dependency on $n$ is exponential).

Another property that has been studied recently is that of a function being Lipschitz: a function
$f : [k]^n \mapsto R$ is called $c$-Lipschitz if for all $x, y$, $|f(x) - f(y)| \le c\|x - y\|_1$. The Lipschitz testing
question was introduced by Jha and Raskhodnikova [JR11], who show that for the range $R = \delta\mathbb{Z}$,
$O(n^2/(\delta\varepsilon))$ queries suffice^2 for Lipschitz testing. They also give an $O(\varepsilon^{-1} \log k)$-query tester for the
line. For general hypergrids, Awasthi et al. [AJMR12] recently give an $O((\delta\varepsilon)^{-1} n^2 k \log k)$-query
tester when the range is $R = \delta\mathbb{Z}$.^3 As for lower bounds, Jha and Raskhodnikova [JR11] give an
$\Omega(n)$-query lower bound for the Lipschitz testing question on the hypercube, and an $\Omega(\log k)$-query
lower bound for that on the line. The recent manuscript by Blais et al. [BJRY12] mentioned above
also gives an $\Omega(n \log k)$-query lower bound for non-adaptive Lipschitz testers.

1 More generally, a hypergrid is defined as $\prod_{i=1}^{n} [k_i]$; all our results extend to the different $k_i$ case, but for brevity's
sake we will stick to the symmetric case.

2 One can also get a bound of $O(nD/(\delta\varepsilon))$, where $D$ is a bound on the range of values that $f$ takes.

Testing the Lipschitz property is a natural and important question that arises in many
applications. For instance, given a computer program, one may like to test the sensitivity of the program's
output to the input. This has been studied before, for instance in [CGLN11]; however, the solution
provided looks into the code to detect whether the program satisfies Lipschitz or not. The property testing
setting is a black-box approach to the problem. Jha and Raskhodnikova [JR11] also provide an
application to privacy; a class of mechanisms known as Laplace mechanisms proposed by Dwork et
al. [DMNS06] achieves privacy in the process of outputting a function by adding noise proportional
to the Lipschitz constant of the function. To find the Lipschitz constant, one typically needs to
guess $c$ and test whether the function is $c$-Lipschitz.

We give a unified tester for the Lipschitz property that improves all known results and matches
existing lower bounds. In fact, the testers are the same as those for monotonicity; the pairs are chosen
at random from the same set $H$, and checked for the Lipschitz condition instead of monotonicity.
To our knowledge, no non-trivial result was known for general ranges with arbitrarily small $\delta$.

Theorem 3. There exists a non-adaptive, one-sided $O(\varepsilon^{-1} n \log k)$-query $c$-Lipschitz tester for
functions $f : [k]^n \mapsto R$.

Our techniques apply to property testing of a much larger class of functions that contains
monotonicity and Lipschitz. We call it the bounded derivative property, or more technically, the
$(\alpha, \beta)$-Lipschitz property. Given parameters $\alpha, \beta$, with $\alpha < \beta$, we say that a function $f : [k]^n \mapsto R$ has
the $(\alpha, \beta)$-Lipschitz property if for any $x \in [k]^n$, and $y$ obtained by increasing exactly one
coordinate of $x$ by exactly 1, we have $\alpha \le f(y) - f(x) \le \beta$. Note that when $(\alpha = 0, \beta = \infty)$,^4 we get
monotonicity. When $(\alpha = -c, \beta = +c)$, we get $c$-Lipschitz. Our above tester can be generalized for
the $(\alpha, \beta)$-Lipschitz property.
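
To make the definition concrete, the following Python sketch checks the $(\alpha, \beta)$-Lipschitz property by brute force over a small hypergrid. It reads every value of $f$, so it is a specification check and not a property tester; the name `has_bounded_derivative` and the tuple encoding of points in $\{1, \ldots, k\}^n$ are assumptions of this example.

```python
import itertools
import math

def has_bounded_derivative(f, k, n, alpha, beta):
    """Exhaustively check alpha <= f(y) - f(x) <= beta for every unit step x -> y."""
    for x in itertools.product(range(1, k + 1), repeat=n):
        for i in range(n):
            if x[i] == k:
                continue                      # cannot increase this coordinate
            y = x[:i] + (x[i] + 1,) + x[i + 1:]
            if not (alpha <= f(y) - f(x) <= beta):
                return False
    return True

coordinate_sum = lambda p: sum(p)

# (0, infinity) recovers monotonicity; (-c, +c) recovers c-Lipschitz.
is_monotone = has_bounded_derivative(coordinate_sum, 3, 2, 0, math.inf)
is_1_lipschitz = has_bounded_derivative(coordinate_sum, 3, 2, -1, 1)
```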

Theorem 4. There exists a non-adaptive, one-sided $O(\varepsilon^{-1} n \log k)$-query $(\alpha, \beta)$-Lipschitz tester for
functions $f : [k]^n \mapsto R$, for any $\alpha < \beta$.



Although Theorem 4 implies all the other theorems stated above, in what follows we will first
prove Theorem 1 and Theorem 2 before giving a whole proof of Theorem 4. The reason is mainly
notational; the final proof is a little heavy on notation, and the authors believe that the proofs of
the monotonicity theorems illustrate the techniques invented in this paper.



2. The Proof Roadmap 

The challenge of property testing is to relate the tester behavior to the distance to the property.
Consider monotonicity over the hypercube. To argue about the edge tester, we want to show that
a large distance to monotonicity implies many violated edges. Most current analyses of the edge
tester go via what we could call the contrapositive route. If there are few violated edges in $f$,
then they show the distance to monotonicity is small. This is done by modifying $f$ to make it
monotone, and bounding the number of changes as a function of the number of violated edges. There
is an inherently "constructive" viewpoint to this: it specifies a method to convert non-monotone
functions to monotone ones.



3 One can also get a bound of $O(nD \log D/(\delta\varepsilon))$, where $D$ is a bound on the range of values that $f$ takes. [AJMR12]
also give a 2-sided tester making $O(\ldots)$ queries.

4 If the reader is uncomfortable with the choice of $\beta$ as $\infty$, $\beta$ can be thought of as much larger than any value in $f$.
Implementing this becomes difficult when the range becomes large, and bounds degrade with $R$.
For the Lipschitz property, this route becomes incredibly complex. A non-constructive approach
may give more power, but how does one get a handle on the distance? The violation graph provides
a method. The violation graph has an edge between any pair of comparable domain vertices $(x, y)$
($x \prec y$) if $f(x) > f(y)$. The following theorem can be found in [FLN+02] (Corollary 2 of the STOC
proceedings).

Theorem 5 ([FLN+02]). The size of the minimum vertex cover of the violation graph is exactly
$\varepsilon_f|D|$. As a corollary, the size of any maximal matching in the violation graph is at least $\frac{1}{2}\varepsilon_f|D|$.
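
The corollary rests on the standard fact that the endpoints of a maximal matching form a vertex cover. The Python sketch below builds the violation graph of a small hypercube function and compares a brute-force minimum vertex cover with a greedy maximal matching. All function names are hypothetical, and the exhaustive search is only meant for tiny $n$.

```python
import itertools

def violation_graph(f, n):
    """All pairs (x, y) with x below y coordinate-wise and f(x) > f(y)."""
    pts = list(itertools.product((0, 1), repeat=n))
    return [(x, y) for x in pts for y in pts
            if x != y and all(a <= b for a, b in zip(x, y)) and f(x) > f(y)]

def min_vertex_cover_size(edges):
    """Brute-force minimum vertex cover (exponential time; tiny graphs only)."""
    verts = sorted({v for e in edges for v in e})
    for size in range(len(verts) + 1):
        for cover in itertools.combinations(verts, size):
            chosen = set(cover)
            if all(x in chosen or y in chosen for x, y in edges):
                return size

def greedy_maximal_matching(edges):
    """Scan the edges once, keeping every edge whose endpoints are both still free."""
    used, matching = set(), []
    for x, y in edges:
        if x not in used and y not in used:
            matching.append((x, y))
            used.update((x, y))
    return matching

edges = violation_graph(lambda x: -sum(x), 3)   # anti-monotone example on {0,1}^3
cover_size = min_vertex_cover_size(edges)
matching = greedy_maximal_matching(edges)
```

For this anti-monotone example every comparable pair is a violation, so the vertices outside a cover must form an antichain, which is why the minimum cover has $8 - 3 = 5$ vertices.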

Can a large matching in the violation graph imply there are many violated edges? Lehman and
Ron [LR01] take this view. Using this, they reduce the monotonicity testing problem on the hypercube
to certain routing problems on the hypercube. In particular, they show that if for any $k$
source-sink pairs on the directed hypercube, at least $k\mu(k)$ edges need to be deleted in order to
pairwise separate them, then $O(n/(\varepsilon\mu(n)))$ queries suffice for the edge tester. Therefore, if $\mu(n)$ is at
least a constant, one gets a linear query monotonicity tester on the cube. Lehman and Ron [LR01]
explicitly ask for bounds on $\mu(n)$. Briet et al. [BCGSM12] show that $\mu(n)$ could be as small as
$1/\sqrt{n}$, thereby putting an $\Omega(n^{3/2}/\varepsilon)$ bottleneck to the above approach.

In the reduction above, however, the function values are altogether ignored. That is, once one 
moves to the combinatorial routing question on source-sink pairs, the fact that they are related by 
actual function values is lost. In particular, lower bounds on the combinatorial problem do not give 
lower bounds for monotonicity; and in fact, as we show, they can't. Our analysis crucially uses the 
value of the functions to argue about the structure of the maximal matching in the violation graph. 

2.1. It's all about matchings. Our proof is intimately connected with the actual function values
and is non-constructive. The key insight is to move to a weighted violation graph. The weight of
violation $(x, y)$ depends on the property at hand; for now it suffices to know that for monotonicity,
the weight of $(x, y)$ ($x \prec y$) is $f(x) - f(y)$. This can be thought of as a measure of the magnitude of
the violation. We now look at a maximum weighted matching $\mathbf{M}$ in the violation graph. Naturally,
this is maximal as well, so we know $|\mathbf{M}| \ge \frac{1}{2}\varepsilon_f|D|$.

Our testers are uniform pair testers, that is, all our algorithms pick a pair uniformly at random
from a predefined set $H$ of pairs, and check the property on that pair. Our whole analysis is based
on the construction of a one-to-one mapping (not quite, but not far from the truth) between pairs
in $\mathbf{M}$ and violating pairs in $H$. This mapping implies that $H$ contains at least $|\mathbf{M}|$ violating pairs, and thus the uniform pair tester
succeeds with probability $\Omega(\varepsilon_f|D|/|H|)$, implying $O(|H|/(\varepsilon_f|D|))$ queries suffice.
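
The step from a success probability to a query count can be spelled out as follows. This is a short worked calculation under the assumption that $H$ contains at least $|\mathbf{M}|$ violating pairs; the symbols $V$ (the violating pairs of $H$), $p$, $q$ and the constant $\ln 3$ are notation introduced here for illustration.

```latex
p \;=\; \Pr[\text{one sample hits a violating pair}]
  \;=\; \frac{|V|}{|H|}
  \;\ge\; \frac{|\mathbf{M}|}{|H|}
  \;\ge\; \frac{\varepsilon_f\,|D|}{2\,|H|},
\qquad
(1-p)^{q} \;\le\; e^{-pq} \;\le\; \tfrac{1}{3}
\quad\text{whenever}\quad
q \;\ge\; \frac{\ln 3}{p}.
```

So when $\varepsilon_f \ge \varepsilon$, taking $q = \lceil 2\ln 3 \cdot |H|/(\varepsilon|D|) \rceil$ independent samples finds a violation with probability at least $2/3$. For the hypercube, $|D| = 2^n$ and $|H| = n2^{n-1}$, which gives $q = O(n/\varepsilon)$.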

To obtain this mapping, we first decompose $\mathbf{M}$ into sets $M_1, M_2, \ldots, M_t$ such that each pair in
$\mathbf{M}$ is in at least one $M_i$. Furthermore, we partition $H$ into sets $H_1, H_2, \ldots, H_t$, respectively. The $M_i$'s
are clearly matchings since $\mathbf{M}$ was one. The crucial thing regarding the partition of $H$ is that each
of the $H_i$'s is a perfect matching of the domain $D$. For instance, in the hypercube case, $M_i$ is the
collection of pairs in $\mathbf{M}$ whose $i$th coordinates differ, and $H_i$ is the collection of pairs differing only
in the $i$th coordinate; for the hypergrid case, the partitions are more involved.

We map each pair in $M_i$ to a unique violating pair in $H_i$. For simplicity, let's forget the subscripts
and call the matchings $M$ and $H$. Let's denote the endpoints of $M$ by the set $X$. We now consider
the alternating paths and cycles generated by $\mathbf{M}$ and $H$ (note we use $\mathbf{M}$ to generate these, not $M$).
Starting from a point $x \in X$, we walk along the alternating objects, beginning with the $H$-edge.
This gives a sequence of vertices, which we call $S_x$, for each $x \in X$. We terminate this sequence if
we ever reach a vertex which is $\mathbf{M}$-unmatched (recall $H$ is a perfect matching), or if we encounter
another vertex of $X$. In this way we get a collection of sequences, and it is not hard to see they are
disjoint. A detailed description is given in §3. Our main technical lemma shows that there must
exist at least one violating $H$-pair in each $S_x$. The mapping is now complete.
FIGURE 1. The alternating path: the curved lines connect pairs of $\mathbf{M}$, and the dashed
lines are edges of $H$.



2.2. Getting the violating $H$-pairs. But why should each alternating path have a violating
$H$-pair? As mentioned earlier, let us focus on monotonicity on the boolean hypercube, so an $H$-pair
is just an edge. Consider $M$, the pairs of $\mathbf{M}$ which differ on the first coordinate, and let $H$ be the set of
edges in the dimension cut along this coordinate. Let $(x, y) \in M$, and say $(x)_1 = 0$, giving us $x \prec y$.
(We denote the $a$th coordinate of $x$ by $(x)_a$.) Recall that the weight of this violation is $f(x) - f(y)$.
The first step from $x$ (in $S_x$) leads to $s_1$. Note that $s_1 \prec y$. Suppose we stopped here because
$s_1$ was $\mathbf{M}$-unmatched. Now for the crucial observation. Delete $(x, y)$ from $\mathbf{M}$ and add $(s_1, y)$. If
$(x, s_1)$ was not a violation, so $f(x) < f(s_1)$,^5 then $f(s_1) - f(y) > f(x) - f(y)$. We obtain a new
matching with larger weight, contradicting the choice of $\mathbf{M}$. Maybe $s_1$ was not $\mathbf{M}$-unmatched, but
was in $X$. That is, the matched pair $(s_1, s_2)$ is in $M$. Observe, however, that if $(s_1, s_2) \in M$, we get
$s_1 \succ s_2$. This is because $(s_1)_1 = 1$ (since $(s_1)_1 \ne (x)_1$) and $(s_1, s_2)$ must differ on the 1st coordinate,
implying $(s_2)_1 = 0$. Note that $s_2 \prec y$, so we could replace pairs $(x, y)$ and $(s_2, s_1)$ in $\mathbf{M}$ with $(s_2, y)$.
Again, if $(x, s_1)$ is not a violation, then $f(s_2) - f(y) > [f(s_2) - f(s_1)] + [f(x) - f(y)]$, contradicting
the maximality of $\mathbf{M}$.

With care, this argument can be carried over for longer chains, and a description of this is given
later in the paper. Let us demonstrate it a little further. Again, we start with $(x, y) \in M$, and $(x)_1 = 0$. Following
the sequence $S_x$, the first term $s_1$ is $x$ projected "up" dimension cut $H$. The second term is obtained
by following the $\mathbf{M}$-pair incident to $s_1$. Suppose it exists, and the other end is $s_2$. In the next
step, $s_2$ is projected "down" along $H$ to get $s_3$. Suppose $s_2 \prec s_1$. Then, one can remove $(x, y)$ and
$(s_1, s_2)$ and add $(x, s_1)$ and $(s_2, y)$ to increase the matching weight. (We just made the argument
earlier; the interested reader may wish to verify.) Hence, $s_2 \succ s_1$, and we get the left part of Fig. 1.
Observe that $x \prec y$ and $s_1 \prec s_2$. By the nature of the dimension cut $H$, $x \prec s_3$ and $s_1 \prec y$. So, if
$s_3$ is unmatched and $(s_2, s_3)$ is not a violation, we can again rearrange the matching to improve the
weight. We alternately go "up" and "down" $H$ in traversing $S_x$, because of which we can modify
the pairs in $\mathbf{M}$ and get other matchings in the violation graph. The maximality of $\mathbf{M}$ imposes
additional structure, which leads to violating edges in $H$.

In general, the spirit of all our arguments is as follows. Take an endpoint of M and start walking 
along the sequence given by the alternating paths generated by M and H. Naturally, this sequence 
must terminate somewhere. If we never encounter a violating pair of H during the entire sequence, 
then we can rewire the matching M and increase the weight. Contradiction! 

5 We are assuming here that all function values are distinct; as we show in Claim 5, this is without loss of generality.

Observe the crucial nature of alternating up and down movements along $H$. This happens
because the first coordinate of the points in $S_x$ switches between the two values of 0 and 1 (for
$k = 2$). Such a reasoning does not hold water in the hypergrid domain. The structure of $H$ needs
to be more complex, and is not as simple as a partition of the edges of the hypergrid. Consider
the extreme case of the line $[k]$. Let $2^r$ be less than $k$. We break $[k]$ into contiguous pieces of
length $2^r$. We can now match the first part to the second, the third to the fourth, etc. In other
words, the pairs look like $(1, 2^r + 1)$, $(2, 2^r + 2)$, $\ldots$, $(2^r, 2^{r+1})$, then $(2^{r+1} + 1, 2^{r+1} + 2^r + 1)$,
$(2^{r+1} + 2, 2^{r+1} + 2^r + 2)$, etc. We can construct such matchings for all powers of 2 less than $k$,
and these will be our $H_i$'s. Those familiar with existing proofs for monotonicity on $[k]$ will not be
surprised by this set of matchings. All methods need to cover all "scales" from 1 to $k$ (achieved by
making them all powers of 2 up to $k$). It can also be easily generalized to $[k]^n$.
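
The matchings on the line can be generated mechanically. The Python sketch below follows the description above (blocks of length $2^r$, first matched to second, third to fourth, and so on). The function names are hypothetical, and skipping pairs that would fall outside $[k]$ when $k$ is not a multiple of $2^{r+1}$ is an assumption of this example.

```python
def line_matching(k, r):
    """Pairs (a, a + 2^r) obtained by matching consecutive blocks of length 2^r in [k]."""
    step = 2 ** r
    pairs = []
    start = 1
    while start + step <= k:
        for a in range(start, start + step):
            if a + step <= k:            # partner must still lie inside [k]
                pairs.append((a, a + step))
        start += 2 * step                # jump over the two blocks just matched
    return pairs

def all_line_matchings(k):
    """One matching per power of two 2^r < k, keyed by r."""
    matchings = {}
    r = 0
    while 2 ** r < k:
        matchings[r] = line_matching(k, r)
        r += 1
    return matchings

scale_one = line_matching(8, 1)          # blocks {1,2},{3,4},{5,6},{7,8}
```

When $k$ is a power of 2 each matching is perfect; otherwise some points stay unmatched, as `line_matching(6, 2)` shows.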

What about the choice of $\mathbf{M}$? Simply choosing $\mathbf{M}$ to be a maximum weight matching and setting
up the sequences $S_x$ does not seem to work. It suffices to look at $[k]^2$ and the matching $H$ along
the first coordinate where $r = 0$, so the pairs are $\{(x, x') \mid (x)_1 = 2i - 1, (x')_1 = 2i, (x)_2 = (x')_2\}$.
A good candidate for the corresponding $M$ is the set of pairs in $\mathbf{M}$ that connect lower endpoints
of $H$ to higher endpoints of $H$. Let us now follow $S_x$ as before. Refer to the right part of Fig. 1.
Take $(x, y) \in M$ and let $x \prec y$. We get $s_1$ by following the $H$-edge on $x$, so $s_1 \succ x$. We follow the
$\mathbf{M}$-pair incident to $s_1$ (suppose it exists) to get $s_2$. We could get $s_2 \succ s_1$. It is in $s_3$ that we see a
change from the hypercube. We could get $s_3 \succ s_2$, because there is no guarantee that $s_2$ is at the
higher end of an $H$-pair. This could not happen in the hypercube. We could have a situation where
$s_3$ is unmatched, we have not encountered a violation in $H$, and yet we cannot rearrange $\mathbf{M}$ to
increase the weight. For a concrete example, consider $x = (1, 1)$, $y = (4, 3)$, $s_1 = (2, 1)$, $s_2 = (5, 2)$,
$s_3 = (6, 2)$ (as in Fig. 1) and $f(x) = f(s_1) = f(s_3) = 1$, $f(y) = f(s_2) = 0$. Some thought leads to
the conclusion that $s_3$ must be less than $s_2$ for any such rearrangement argument to work.
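
The concrete example can be checked by exhaustive search. The Python sketch below restricts attention to the five named points (an assumption of this illustration; the full domain is ignored) and confirms that neither $H$-pair is violated while no matching of violations among these points beats weight 2.

```python
pts = {"x": (1, 1), "y": (4, 3), "s1": (2, 1), "s2": (5, 2), "s3": (6, 2)}
f = {"x": 1, "y": 0, "s1": 1, "s2": 0, "s3": 1}

def below(a, b):
    """True if a precedes b coordinate-wise and a != b."""
    return a != b and all(p <= q for p, q in zip(a, b))

def weight(u, v):
    """Monotonicity violation weight of the pair {u, v}, or None if not a violation."""
    for lo, hi in ((u, v), (v, u)):
        if below(pts[lo], pts[hi]) and f[lo] > f[hi]:
            return f[lo] - f[hi]
    return None

def best_matching_weight(names):
    """Maximum total weight of a matching of violations among the given points."""
    names = list(names)
    if len(names) < 2:
        return 0
    first, rest = names[0], names[1:]
    best = best_matching_weight(rest)            # option: leave `first` unmatched
    for j, other in enumerate(rest):
        w = weight(first, other)
        if w is not None:
            best = max(best, w + best_matching_weight(rest[:j] + rest[j + 1:]))
    return best

current = weight("x", "y") + weight("s1", "s2")  # the matching from the text
optimum = best_matching_weight(pts)
```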

The road out of this impasse is suggested by two observations. First, the difference in
1-coordinates between $s_1$ and $s_2$ must be odd. Next, we could rearrange and match $(x, s_2)$ and
$(s_1, y)$ instead. The weight may not increase, but this matching might be more amenable to the
alternating path approach. We could start from a maximum weight matching that also maximizes
the number of pairs where coordinate differences are even. Indeed, the major insight for hypergrids
is the definition of a potential $\Phi$ for $\mathbf{M}$, which is a generalization of this idea. The potential $\Phi$ is
obtained by summing, for every pair $(x, y) \in \mathbf{M}$ and every coordinate $a$, the largest power of 2
dividing the difference $|(x)_a - (y)_a|$. We can show that a maximum weight matching that also
maximizes $\Phi$ does not end up in the bad situation above. With some additional arguments, we can
generalize the hypercube proof. We describe this in §5. Observe that the potential with alternating
paths gives a unified and optimal proof for two very "different" hypergrids: the hypercube and the
line.
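
As a small illustration of the potential, the Python sketch below evaluates it on the two matchings from the example $x = (1,1)$, $y = (4,3)$, $s_1 = (2,1)$, $s_2 = (5,2)$. Two conventions here are assumptions of this sketch and are not taken from the text: the summand is the power of 2 itself (for example 4 for a difference of 12), and a coordinate difference of 0 contributes 0.

```python
def largest_power_of_two_dividing(d):
    """Largest power of 2 dividing d > 0; returns 0 for d == 0 (assumed convention)."""
    return d & -d if d else 0

def potential(matching):
    """Sum over pairs and coordinates of the largest power of 2 dividing the difference."""
    return sum(largest_power_of_two_dividing(abs(a - b))
               for u, v in matching
               for a, b in zip(u, v))

x, y, s1, s2 = (1, 1), (4, 3), (2, 1), (5, 2)
phi_original = potential([(x, y), (s1, s2)])    # odd differences in coordinate 1
phi_rewired = potential([(x, s2), (s1, y)])     # even differences in coordinate 1
```

The rewired matching has the same weight in the example but a larger potential, which is the sense in which it is preferred.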



2.3. Attacking the generalized Lipschitz property. One of the challenges in dealing with the
Lipschitz property is the lack of direction. The Lipschitz property, defined as $\forall x, y$, $|f(x) - f(y)| \le
\|x - y\|_1$, is fundamentally an undirected property, while monotonicity is directed in nature. In
monotonicity, a point $x$ only "interacts" with the subcube above and below $x$, while in Lipschitz,
constraints are defined between all pairs of points. Previous results for Lipschitz testing require very
technical and clever machinery to deal with this issue, since arguments analogous to monotonicity
just do not work. The alternating paths argument given above for monotonicity also exploits this
directionality, as can be seen by heavy use of inequalities in the informal calculations. Observe that
in the monotonicity example for hypergrids in Fig. 1, the fact that $s_3 \succ s_2$ (as opposed to $s_3 \prec s_2$)
required the potential $\Phi$ (and a whole new proof). The $(\alpha, \beta)$-Lipschitz property creates even more
problems, since constraints are not symmetric between $x$ and $y$.

A subtle point is that while the property of Lipschitz is undirected, violations to Lipschitz are
"directed". If $|f(x) - f(y)| > \|x - y\|_1$, then either $f(x) - f(y) > \|x - y\|_1$ or $f(y) - f(x) > \|x - y\|_1$,
but never both. This can be interpreted as a direction for violations. In the alternating paths for
monotonicity (especially for the hypercube), the partial order relation between successive terms
follows a fixed pattern. This is crucial for performing the matching rewiring.
As might be guessed, the weight of a violation $(x, y)$ becomes $\max(f(x) - f(y) - \|x - y\|_1,\ f(y) -
f(x) - \|x - y\|_1)$. For the generalized Lipschitz problem, this is defined in terms of a
pseudo-distance over the domain. We look at the maximum weight matching as before (and use the same
potential function $\Phi$). The notion of "direction" takes the place of the partial order relation in
monotonicity. The main technical arguments show that these directions follow a fixed pattern in
the corresponding alternating paths. Once we have this pattern, with some work we can perform
the matching rewiring argument for the generalized Lipschitz problem.
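
The weight and direction of a Lipschitz violation can be computed as in the following Python sketch for the 1-Lipschitz case. The function name and the convention of returning the direction as an ordered pair (larger value first) are assumptions of this example.

```python
def l1_distance(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def lipschitz_violation(f, x, y):
    """Return (weight, direction) if (x, y) violates 1-Lipschitz, else None.

    direction is (u, v) with f(u) - f(v) > ||u - v||_1; at most one order qualifies.
    """
    d = l1_distance(x, y)
    forward = f(x) - f(y) - d
    backward = f(y) - f(x) - d
    w = max(forward, backward)
    if w <= 0:
        return None
    return (w, (x, y) if forward > 0 else (y, x))

steep = lambda p: 3 * p[0]                      # slope 3 along the line
found = lipschitz_violation(steep, (1,), (2,))  # weight 2, directed from (2,) to (1,)
```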



3. Alternating Paths and the Sequence $S_x$

In this section we formally define the sequences $S_x$ as described in the proof roadmap above. We will
need three objects: $\mathbf{M}$, the matching of violating pairs; $M$, one of the parts of the classification of
$\mathbf{M}$; and $H$, a matching of $D$. Usually, for each $M$, we will find some appropriate $H$.

We require the following technical definition. This is necessary only for the hypergrid proof.
This arises because we will use matchings that are not necessarily perfect. Observe that a perfect
matching $H$ is always adequate.

Definition 1. The matching $H$ is adequate for $\mathbf{M}$ if for every violation $(x, y)$, both $x$ and $y$ participate
in the matching $H$.

We will henceforth assume that $H$ is adequate. The symmetric difference of $\mathbf{M}$ and $H$ is a collection
of alternating paths and cycles. Let $X$ be the endpoints of $M$ that are present in these. (Note that
if a pair in $M$ is actually an edge in $H$, then the endpoints of $M$ are not present in the alternating
paths/cycles.) We will denote the set of alternating paths/cycles that contain some vertex of $X$ by
$A$.

We define the sequence $S_x$ for all $x \in X$ as follows.

(1) The first term $S_x(0)$ is $x$.

(2) For even $i$, $S_x(i + 1) = H(S_x(i))$.

(3) For odd $i$: if $S_x(i) \in X$, or is $\mathbf{M}$-unmatched, then $S_x$ terminates.
Otherwise, $S_x(i + 1) = \mathbf{M}(S_x(i))$.

Note that because $H$ is adequate for $\mathbf{M}$, this sequence is well defined. Indeed, it never terminates
at an even term, since $H(S_x(i))$ always exists. As described in the introduction, an intuitive way of
understanding $S_x$ is by looking at what happens in $A$. All the paths/cycles of $A$ containing points
of $X$ can be partitioned into contiguous sequences. Pick any vertex $x \in X$ and start walking
along the $H$-link incident to it. We stop when we reach a vertex in $X$. We keep repeating this
procedure until all paths/cycles in $A$ are subpartitioned into the sequences.
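
The three rules defining $S_x$ translate directly into a short loop. In the Python sketch below, matchings are stored as symmetric dictionaries; the names `as_map` and `build_sequence` and the sample data (the hypergrid example of §2.2, with $y$ paired to the hypothetical point $(3, 3)$ in $H$) are assumptions made for illustration.

```python
def as_map(pairs):
    """Store a matching as a dict mapping each endpoint to its partner."""
    partner = {}
    for a, b in pairs:
        partner[a] = b
        partner[b] = a
    return partner

def build_sequence(x, H, M_full, X):
    """Follow rules (1)-(3): H-step from even positions, M_full-step from odd ones."""
    seq = [x]                                   # rule (1): S_x(0) = x
    while True:
        i = len(seq) - 1
        cur = seq[-1]
        if i % 2 == 0:
            seq.append(H[cur])                  # rule (2): H is adequate, so H[cur] exists
        elif cur in X or cur not in M_full:
            return seq                          # rule (3): stop at X or an unmatched vertex
        else:
            seq.append(M_full[cur])

x, y, s1, s2, s3 = (1, 1), (4, 3), (2, 1), (5, 2), (6, 2)
H = as_map([(x, s1), (s2, s3), ((3, 3), y)])
M_full = as_map([(x, y), (s1, s2)])             # the full matching of violations
X = {x, y}                                      # endpoints of the part M = {(x, y)}
S_x = build_sequence(x, H, M_full, X)
S_y = build_sequence(y, H, M_full, X)
```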

Observe that any cycle containing some point $x \in X$ also contains $\mathbf{M}(x) = M(x) \in X$. Hence,
this decomposition breaks the cycle into a collection of paths with the following property. The
first and last vertices of these paths are in $X$, and all internal vertices in the path are not in $X$. The
starting and ending edges are in $H$. (The paths are undirected, so the label of start and end is
quite arbitrary.) Every vertex in $X$ is the start or end of some path. The sequence $S_x$ is simply the
ordered list of vertices (starting from $x$) in the path containing $x$. The following proposition, whose
proof follows from the discussion above, captures basic properties of $S_x$. We use $T(x)$ to denote
the last vertex in $S_x$. We refer the reader to Fig. 2 for an illustration of the procedure described
above.



Proposition 1. Assume that $H$ is adequate for $\mathbf{M}$.

• Every $S_x$ terminates.

• $T(x)$ is $\mathbf{M}$-unmatched, or $T(x) \in X$, in which case $S_{T(x)}$ is the reverse of $S_x$ and $T(T(x)) = x$.

• For $x, y \in X$, either $y = T(x)$ or $S_x$ and $S_y$ are disjoint.
FIGURE 2. An illustration of the alternating paths. The dashed lines are $\mathbf{M}$-edges; the
solid lines are $H$-edges. Consider the edge $(x, y) \in M$. Thus, $x, y \in X$. The top alternating
path starting from $x$ is $S_x$, with $s_i$ used as a shorthand for $S_x(i)$. $S_x$ terminates at $s_j = T(x)$ since
$T(x)$ also lies in $X$. The alternating path starting from $T(x)$ would be the same as $S_x$ but
reversed, and would end at $x$. The path starting from $y$ going below is $S_y$ and ends at
$S_y(9) = T(y)$. This could be because $T(y)$ is $\mathbf{M}$-unmatched. Note that $T(y)$ could also be
$M(T(x))$ (in which case, this would be an alternating cycle).



As stated in the introduction, the main part of the proofs will be to show that there exists a violated
$H$-pair in $S_x$, for all $x \in X$. Note that this would imply $|M| \le |H|$ since the $S_x$'s are disjoint. For
the sake of contradiction, we assume there is some $S_x$ without a violated $H$-pair. With this, the
following two lemmas contradict the termination condition of Prop. 1.

Lemma 1 (Progress Lemma). For odd $i$, $S_x(i)$ is not $\mathbf{M}$-unmatched.

Lemma 2 (Disjointedness Lemma). For odd $i$, $S_x(i) \notin X$.

It is clear from Prop. 1 that the two lemmas imply non-termination. The proofs of
both lemmas essentially show that if there are no $H$-violating pairs in $S_x$ thus far, and the condition
of the lemma didn't hold, then a better $\mathbf{M}$ can be found.

4. Monotonicity on Boolean Hypercube



We prove Theorem 1. The weight of a pair $(x, y)$ is defined to be $f(x) - f(y)$ if $x \prec y$, and is
$-\infty$ otherwise. Thus violating pairs have positive weight. We choose a maximum weight matching
$\mathbf{M}$ of pairs. Note that every pair in $\mathbf{M}$ is a violating pair. Furthermore, $\mathbf{M}$ is also a maximal
family of disjoint violating pairs, and therefore, $|\mathbf{M}| \ge \frac{1}{2}\varepsilon_f \cdot 2^n$. We distribute the pairs in $\mathbf{M}$ into
$n$ classes $M_1, \ldots, M_n$. $M_i$ contains the pairs which differ in coordinate $i$. Note that each pair in
$\mathbf{M}$ is in some class.
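
For very small $n$ the construction can be carried out by exhaustive search. The Python sketch below computes a maximum weight matching of violating pairs and then splits it into classes by differing coordinate. The function names are hypothetical, coordinates are 0-indexed in the code (so class 0 corresponds to $M_1$), and the recursion is exponential, so it is meant only as an illustration.

```python
import itertools

def max_weight_violation_matching(f, n):
    """Return (weight, matching) maximizing the total of f(x) - f(y) over violations."""
    pts = list(itertools.product((0, 1), repeat=n))

    def weight(x, y):
        if x != y and all(a <= b for a, b in zip(x, y)) and f(x) > f(y):
            return f(x) - f(y)
        return None                      # not a violation (weight minus infinity)

    def solve(free):
        if len(free) < 2:
            return 0, []
        first, rest = free[0], free[1:]
        best = solve(rest)               # option: leave `first` unmatched
        for j, other in enumerate(rest):
            # `first` comes earlier in lexicographic order, so it can only be the lower point
            w = weight(first, other)
            if w is not None:
                sub_w, sub_m = solve(rest[:j] + rest[j + 1:])
                if w + sub_w > best[0]:
                    best = (w + sub_w, [(first, other)] + sub_m)
        return best

    return solve(pts)

def classes_by_coordinate(matching, n):
    """classes[i] holds the pairs that differ in coordinate i (a pair may appear in several)."""
    return {i: [(x, y) for x, y in matching if x[i] != y[i]] for i in range(n)}

table = {(0, 0): 3, (0, 1): 2, (1, 0): 1, (1, 1): 0}
total, matching = max_weight_violation_matching(table.__getitem__, 2)
classes = classes_by_coordinate(matching, 2)
```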

We denote the set of all edges of the hypercube as $H$. We partition $H$ into $H_1, \ldots, H_n$ where $H_i$
is the collection of hypercube edges which differ only in the $i$th coordinate. Note that each $H_i$ is a
perfect matching and is hence always adequate. We let $C_i \subseteq H_i$ denote the violating pairs of $H_i$.
The following is the main charging lemma.

Lemma 3. For all $1 \le r \le n$, $|M_r| \le |C_r|$.



The above lemma proves Theorem 1: the probability that the edge tester succeeds is precisely

\[ \frac{1}{n2^{n-1}} \sum_{r=1}^{n} |C_r| \;\ge\; \frac{|\mathbf{M}|}{n2^{n}} \;\ge\; \frac{\varepsilon_f}{2n}. \]

Therefore, $O(n/\varepsilon)$ queries suffice.

Henceforth, we lose the subscript $r$. Recall $X$ is the set of endpoints of $M$. We feed $\mathbf{M}$, $H$, $M$ into
the machinery of §3 to obtain the sequences $S_x$ for $x \in X$. Fix $x \in X$, and for brevity's sake we
let $s_i$ denote $S_x(i)$ for $i \ge 0$. We also let $s_{-1}$ denote $M(x)$. Without loss of generality, assume
$x[r] = 0$ (it could be either 0 or 1; the argument in the other case is analogous). Recall that for even $i$,
$(s_i, s_{i+1}) \in H$ and for odd $i$, $(s_i, s_{i+1}) \in \mathbf{M}$. The reader may find it useful to refer to Fig. 2. We
start off with the following observation.

Proposition 2. For $i \equiv 0, 3 \pmod 4$, $s_i[r] = 0$. For $i \equiv 1, 2 \pmod 4$, $s_i[r] = 1$.

Proof. If $i$ is odd, then $(s_{i-1}, s_i) \in H$. Therefore, $s_i[r] \ne s_{i-1}[r]$. If $i \equiv 1 \pmod 4$, then by induction
$s_{i-1}[r] = 0$ and thus $s_i[r] = 1$. The $i \equiv 3 \pmod 4$ case is similar. If $i$ is even, then $(s_{i-1}, s_i) \in \mathbf{M}$.
Furthermore, $s_{i-1} \notin X$. So, $s_{i-1}[r] = s_i[r]$. A similar argument as above finishes the proof. □

For contradiction's sake, we assume $(s_{i-1}, s_i) \notin C$ for any odd $i$. Using the above proposition, this
implies

\[ (*) \qquad f(s_{i-1}) - f(s_i) \;\begin{cases} > 0, & \forall i \equiv 3 \pmod 4 \\ < 0, & \forall i \equiv 1 \pmod 4 \end{cases} \]

Note the strict inequalities used; this is without loss of generality, although we prove it formally
later in Claim 5. The above property will now be assumed to hold in the remainder of the proof.
In the end we will get a contradiction.

For odd $i$, the pair $(s_i, s_{i+1})$ lies in $\mathbf{M}$, but a priori we do not know which of these two is the
ancestor and which is the descendant. The following lemma characterizes this.

Lemma 4. Let $i$ be odd. Then,

\[ \forall i \equiv 1 \pmod 4,\ s_{i+1} \succ s_i; \qquad \forall i \equiv 3 \pmod 4,\ s_i \succ s_{i+1}. \]

Proof. The proof is by induction on $i$. Assume the claim is true for all odd $j < i$, for some $i \equiv 1 \pmod 4$. The proof for the other case is similar. By construction, the base case ($i = -1$) can be checked to hold true. Suppose, for contradiction, that $s_i \succ s_{i+1}$. We now construct a matching $\mathbf{M}'$ of larger weight than $\mathbf{M}$ as follows. Delete the set of $\mathbf{M}$-edges $E_- := \{(s_j, s_{j+1}) : j \text{ odd}, -1 \le j \le i\}$, and add the set of edges

$$E_+ := (s_{-1}, s_1) \cup \{(s_{j-1}, s_{j+2}) : j \text{ odd}, 1 \le j \le i-4\} \cup (s_{i-3}, s_{i+1}).$$

Check that $\mathbf{M} - E_- + E_+$ is a valid matching which leaves $s_i, s_{i-1}$ unmatched. Now we consider the weights. The weight of $E_-$, by induction, is

$$W_- = [f(s_0) - f(s_{-1})] + [f(s_1) - f(s_2)] + [f(s_4) - f(s_3)] + \cdots + [f(s_{i-1}) - f(s_{i-2})] + [f(s_{i+1}) - f(s_i)].$$

Observe the signs changing from term to term due to the induction hypothesis, except for the last term, which is assumed for the sake of contradiction. Also by induction, note that $(s_{j-1}, s_{j+2}) = (s_j \oplus e_r, s_{j+1} \oplus e_r)$, where $e_r$ is a vector with either $+1$ or $-1$ on the $r$th coordinate and $0$ everywhere else, and $\oplus$ is the coordinate-wise sum operator. This is because $(s_{j-1}, s_j)$ and $(s_{j+1}, s_{j+2})$ are both in $H$, and it suffices to show that the $r$th coordinates of $s_{j-1}$ and $s_{j+1}$ are different. This follows from Prop. 2. Therefore, when $j \equiv 3 \pmod 4$, and therefore by induction $s_j \succ s_{j+1}$, we have $s_{j-1} \succ s_{j+2}$. Similarly, when $j \equiv 1 \pmod 4$, $s_{j-1} \prec s_{j+2}$. We get that whenever $1 \le j \equiv 1 \pmod 4$, $s_{j+2} \succ s_{j-1}$, and whenever $(i-2) \ge j \equiv 3 \pmod 4$, $s_{j-1} \succ s_{j+2}$. In particular, $s_{i-3} \succ s_i$, which in turn, for the sake of contradiction, we have assumed to be $\succ s_{i+1}$; that is, $s_{i-3} \succ s_i \succ s_{i+1}$. Using this, we get the weight of $E_+$ is precisely

$$W_+ = [f(s_1) - f(s_{-1})] + [f(s_0) - f(s_3)] + \cdots + [f(s_{i-5}) - f(s_{i-2})] + [f(s_{i+1}) - f(s_{i-3})].$$

Thus, we get the weight of the new matching is precisely $w(\mathbf{M}) - W_- + W_+ = w(\mathbf{M}) + f(s_i) - f(s_{i-1})$. By (*), we get that $f(s_i) > f(s_{i-1})$, contradicting the maximality of $\mathbf{M}$. □

Armed with this handle on the ancestor-descendant relationships, we prove the progress and disjointedness lemmas alluded to in §3.



Lemma 5. For odd $i$, $s_i$ is not $\mathbf{M}$-unmatched.

Proof. Suppose it is. Then, as in the proof of the previous lemma, we can find a better matching. Once again, assume $i \equiv 1 \pmod 4$. We delete the set of edges $E_- := \{(s_j, s_{j+1}) : j \text{ odd}, -1 \le j \le i-2\}$ and add the set of edges $E_+ = (s_{-1}, s_1) \cup \{(s_{j-1}, s_{j+2}) : j \text{ odd}, 1 \le j \le i-3\}$. Lemma 4 shows that $\mathbf{M} - E_- + E_+$ is a valid matching whose weight is, as before, $w(\mathbf{M}) + f(s_i) - f(s_{i-1}) > w(\mathbf{M})$ by (*). □

Lemma 6. For odd $i$, $s_i \notin X$.

Proof. This is really just a corollary of Lemma 4. Suppose $i \equiv 1 \pmod 4$. Then, by Prop. 2, $s_i[r] = 1$. By Lemma 4, $\mathbf{M}(s_i) \succ s_i$, and so $s_{i+1}[r] = 1$. Hence, $(s_i, s_{i+1}) \notin M$, and thus $s_i \notin X_r$. If $i \equiv 3 \pmod 4$, then $s_i[r] = 0$ and $\mathbf{M}(s_i) \prec s_i$. Again, $s_i \notin X_r$. □



We conclude that for any $x \in X$, if Condition (*) holds, then $S_x$ can never terminate. The non-termination contradicts Prop. 1, and therefore (*) must be violated. Therefore, we can find a violated edge in each $S_x$. The number of such sequences is at least $|M|$; each endpoint leads to a sequence, and at most two sequences collide (Prop. 1). The number of endpoints is $2|M|$. This ends the proof of Lemma 3, and thus the proof of Theorem 1.



5. Monotonicity on Hypergrids



In this section, we prove Theorem 2. Let us recall the tester. We define $H$ to be the pairs that differ in exactly one coordinate, where furthermore the difference is a power of 2. The tester chooses a pair in $H$ uniformly at random and checks monotonicity on this pair. We describe the partition of $H$ and make some (unimportant) technical arguments allowing for a more convenient setting.
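As an illustration, the tester just recalled can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: points are tuples with coordinates in $\{1, \dots, k\}$ with $k \ge 2$, the function names are hypothetical, and uniform sampling from $H$ is done by rejection (every triple of coordinate, gap and lower endpoint is equally likely, and accepted triples correspond one-to-one to pairs of $H$).

```python
import random

def power_of_two_gaps(k):
    """All gaps 2**b that fit inside {1, ..., k}, i.e. 2**b <= k - 1."""
    gaps, g = [], 1
    while g <= k - 1:
        gaps.append(g)
        g *= 2
    return gaps

def sample_pair(n, k, rng):
    """Sample (x, y) uniformly from H by rejection; assumes k >= 2."""
    gaps = power_of_two_gaps(k)
    while True:
        a = rng.randrange(n)                    # coordinate that differs
        gap = rng.choice(gaps)                  # power-of-2 difference
        x = [rng.randint(1, k) for _ in range(n)]
        if x[a] + gap <= k:                     # accept only if y stays in the grid
            y = list(x)
            y[a] += gap
            return tuple(x), tuple(y)

def hypergrid_tester(f, n, k, num_samples, seed=0):
    """Return False (reject) if some sampled pair x < y has f(x) > f(y)."""
    rng = random.Random(seed)
    for _ in range(num_samples):
        x, y = sample_pair(n, k, rng)
        if f(x) > f(y):
            return False
    return True
```

A monotone function is never rejected by this sketch; how many samples are needed to reject a far-from-monotone function is what the analysis in this section bounds.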

5.1. The partition of H and the issue of adequacy. Suppose we had a domain $[k]^n$, where $\ell = \lceil \lg k \rceil$. For this domain, we partition $H$ into $2n(\ell+1)$ sets $H^0_{a,b}$ and $H^1_{a,b}$, $1 \le a \le n$, $0 \le b \le \ell$.

$H_{a,b}$ consists of pairs $(x,y)$ which differ only in the $a$th coordinate, and furthermore $|y[a] - x[a]| = 2^b$.

For a pair $(x,y) \in H_{a,b}$, exactly one among $x[a] \pmod{2^{b+1}}$ (see footnote 3) and $y[a] \pmod{2^{b+1}}$ is $> 2^b$ and one is $\le 2^b$. This is because $|y[a] - x[a]| \pmod{2^{b+1}} = 2^b$ (since $2^b$ divides the difference but $2^{b+1}$ does not). We put $(x,y) \in H_{a,b}$ (with $x \prec y$) in the set $H^0_{a,b}$ if $y[a] \pmod{2^{b+1}} > 2^b$, and in the set $H^1_{a,b}$ if $1 \le y[a] \pmod{2^{b+1}} \le 2^b$.
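The partition can be made concrete with a short Python sketch. Since only the $a$th coordinate matters, the sketch works on a single line $\{1, \dots, k\}$; the helper names are hypothetical, and `amod` implements the abused residue convention of footnote 3 (a residue of 0 is reported as the modulus).

```python
def amod(p, m):
    """Residue of p in {1, ..., m}: m is returned instead of 0."""
    r = p % m
    return r if r != 0 else m

def split_pairs(k, b):
    """Split the pairs (x, y) of {1..k} with y - x = 2**b into the two classes."""
    h0, h1 = [], []
    for x in range(1, k - 2**b + 1):
        y = x + 2**b
        if amod(y, 2**(b + 1)) > 2**b:
            h0.append((x, y))      # upper endpoint lies in the upper half
        else:
            h1.append((x, y))      # upper endpoint lies in the lower half
    return h0, h1

def is_matching(pairs):
    """True if no point is an endpoint of two different pairs."""
    endpoints = [p for pair in pairs for p in pair]
    return len(endpoints) == len(set(endpoints))

h0, h1 = split_pairs(8, 0)
print(h0)  # [(1, 2), (3, 4), (5, 6), (7, 8)]
print(h1)  # [(2, 3), (4, 5), (6, 7)]
```

For $k = 8$ and $b = 0$ the first class covers every point, while the second leaves the points 1 and 8 uncovered, so a class need not be a perfect matching.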

Note that each $H^0_{a,b}$ and $H^1_{a,b}$ is a matching. However, some may not be perfect matchings
(consider the matching Hl ). This technicality forced us to introduce the notion of adequacy of 
matchings. We will eventually prove the following theorem. 

Theorem 6. Let $k$ be a power of 2. Suppose for every violation $(x,y)$ and every coordinate $a$, $|y[a] - x[a]| \le 2^c$ (for some $c$). Furthermore, suppose that for $b \le c$, all matchings $H^0_{a,b}, H^1_{a,b}$ are adequate. Then there exists a maximal matching $\mathbf{M}$ of the violation graph such that the number of violating pairs in $H$ is at least $|\mathbf{M}|$.

First, we reduce the general case to this special case using a simple padding argument. Note that the following theorem implies Theorem 2. Henceforth, we will assume that $k = 2^\ell$ and that all matchings $H^0_{a,b}, H^1_{a,b}$ are adequate (for $b \le c$, where $2^c$ is an upper bound on the coordinate difference for any violation).

Theorem 7. Consider any function $f : [k]^n \to R$. At least an $\varepsilon_f/(2n(\lceil \log k \rceil + 1))$-fraction of pairs in $H$ are violations.

Proof. Let $k' = 2^\ell$ be the smallest power of 2 larger than $4k$. Let us construct a function $f' : [k']^n \to R \cup \{-\infty, +\infty\}$. Let $\mathbf{1}$ denote the $n$-dimensional all 1s vector. For $x$ such that all $x_i \in [k'/4 + 1, k'/4 + k]$, we set $f'(x) = f(x - k'\mathbf{1}/4)$. (We will refer to this region as the "original domain".) If any coordinate of $x$ is at most $k'/4$, we set $f'(x) = -\infty$. Otherwise, we set $f'(x) = +\infty$.

3 We abuse notation and define $p \pmod{2^{b+1}}$ to be $2^{b+1}$ (instead of 0) if $2^{b+1}$ divides $p$.

We observe that all violations are contained in the original domain. For any violation $(x,y)$ and coordinate $a$, $|y[a] - x[a]| < k < 2^{\ell-2}$. Let $H'$ be the matching corresponding to the domain $[k']^n$. For $b \le \ell - 2$ (and every $a$), every point in the original domain participates in $H'^{\,r}_{a,b}$. So, each of these matchings is adequate. Since every maximal matching of the violation graph has size $\ge \varepsilon_f k^n/2$, by Theorem 6 the number of violating pairs in $H'$ is at least $\varepsilon_f k^n/2$.

The matching $H$ is exactly the set of pairs of $H'$ completely contained in the original domain. All violating pairs in $H'$ are contained in $H$. The total size of $H$ is at most $nk^n(\lceil \log k \rceil + 1)$. The proof is completed by dividing $\varepsilon_f k^n/2$ by the size of $H$. □

5.2. The potential $\Phi$. As in the hypercube case, the weight of a pair $(x,y)$ is defined to be $f(x) - f(y)$ if $x \prec y$, and $-\infty$ otherwise. We will now pick $\mathbf{M}$ to be a maximum weight matching, as in the hypercube case; however, we will need this matching to have certain other properties as well. To that end, let us define $\mathrm{msd}(a)$ of a non-negative integer $a$ to be the exponent of the largest power of 2 which divides $a$. That is, $\mathrm{msd}(a) = p$ implies $2^p \mid a$ but $2^{p+1} \nmid a$. We define $\mathrm{msd}(0) := \ell + 1$. Now given a matching $M$, define the following potential.

$$(1)\qquad \Phi(M) := \sum_{(x,y) \in M} \sum_{i=1}^{n} \mathrm{msd}(|x_i - y_i|).$$

Now choose $\mathbf{M}$ to be the maximum weight matching which maximizes $\Phi(\mathbf{M})$. As before, since $\mathbf{M}$ is a maximal set of violated pairs, we get $|\mathbf{M}| \ge \frac{1}{2}\varepsilon_f k^n$. To give some intuition for the potential, note that it is aligned towards picking pairs which differ in as few coordinates as possible (since $\mathrm{msd}(0)$ is large). Furthermore, divisibility by powers of 2 is favored because the tester queries pairs which are 'powers of 2 apart'.
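The potential is easy to evaluate directly. The following Python sketch (function names are hypothetical; `ell` plays the role of $\ell$) computes $\mathrm{msd}$ and $\Phi$ for a matching given as a list of pairs of integer tuples.

```python
def msd(a, ell):
    """Exponent of the largest power of 2 dividing a, with msd(0) = ell + 1."""
    if a == 0:
        return ell + 1
    p = 0
    while a % 2 == 0:
        a //= 2
        p += 1
    return p

def potential(matching, ell):
    """Phi(M): sum of msd(|x_i - y_i|) over all pairs and all coordinates."""
    return sum(msd(abs(xi - yi), ell)
               for x, y in matching
               for xi, yi in zip(x, y))

# A pair differing in one coordinate by 4 versus a pair differing in two
# coordinates by odd amounts (ell = 3).
print(potential([((1, 1), (1, 5))], 3))  # 6  (msd(0) = 4, msd(4) = 2)
print(potential([((1, 1), (2, 4))], 3))  # 0  (msd(1) = 0, msd(3) = 0)
```

The two printed values illustrate the intuition above: the pair that agrees on a coordinate and differs by a power of 2 on the other scores much higher than the pair that differs by odd amounts everywhere.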

We distribute $\mathbf{M}$ into $2n(\ell+1)$ classes as we did for $H$: $M^0_{a,b}$ and $M^1_{a,b}$, for $1 \le a \le n$ and $0 \le b \le \ell$. $M_{a,b} := M^0_{a,b} \cup M^1_{a,b}$ contains pairs $(x,y) \in \mathbf{M}$ which differ in the $a$th coordinate, and furthermore $\mathrm{msd}(|y[a] - x[a]|) = b$. Note that every pair in $\mathbf{M}$ lies in at least one of the classes $M_{a,b}$. We put $(x,y) \in M^0_{a,b}$ if $y[a] \pmod{2^{b+1}} > 2^b$, and in $M^1_{a,b}$ if $1 \le y[a] \pmod{2^{b+1}} \le 2^b$. Note that if $(x,y) \in M^0_{a,b}$, then $x \prec y$. In summary, we divide $\mathbf{M}$ into classes based on which dimensions they differ in, the msd of the length, and the 'parity' of the endpoints.

We let $C^r_{a,b}$ be the violated pairs in $H^r_{a,b}$ for $r \in \{0,1\}$, $1 \le a \le n$, $0 \le b \le \ell$. The following lemma is key.

Lemma 7. For all $r, a, b$, we have $|M^r_{a,b}| \le |C^r_{a,b}|$.

By our assumptions, if $|M^r_{a,b}| > 0$, then $H^r_{a,b}$ for $r \in \{0,1\}$ is adequate. This lemma directly implies Theorem 6.



5.3. The proof of Lemma 7. Let us fix $a, b$. Suppose $r = 0$ (the other case is analogous and omitted). Keeping this in mind, we now lose all superscripts and subscripts. As before, $X$ is the set of endpoints of $M$. Since $H$ is adequate, we can feed $M$, $H$, $\mathbf{M}$ to the machinery of §3 to obtain the sequences $S_x$ for all $x \in X$. Fix $x \in X$. Wlog, assume $x \prec y$ in the pair $(x,y) \in M$. Note that by the assumption $r = 0$, $y[a] \pmod{2^{b+1}} > 2^b$. Since $\mathrm{msd}(y[a] - x[a]) = b$, $(y[a] - x[a]) \pmod{2^{b+1}} = 2^b$. Therefore, $x[a] \pmod{2^{b+1}} \le 2^b$.

We will use $s_i$ to denote $S_x(i)$. We will use $s_{-1}$ to denote $y$. We will also abuse notation to let $s_i$ sometimes denote $s_i[a]$; we hope the context will disabuse. Recall that for even $i$, $(s_i, s_{i+1}) \in H$ while for odd $i$, $(s_i, s_{i+1}) \in \mathbf{M}$.

Henceforth, for contradiction's sake we will assume that $(s_i, s_{i+1}) \notin C$ for any even $i$. We will show that $S_x$ cannot terminate in that case. The following lemma captures the structure of the neighboring pairs in $S_x$ if there are no violating pairs. We would like to point out that the lemma below is more involved than Lemma 4. The reason is that there is no easy analog to Prop. 2. This relates to what we mentioned in the introduction, that is, in the hypercube, if $(x, x')$ is an edge across the $r$th dimension, then $x_r = 0$ implies $x'_r = 1$. For hypergrids that is not true. In fact, we will need the extra "$\Phi$-maximizing" property of the matchings for the lemma to go through.

Lemma 8.

If $i \equiv 1 \pmod 4$:
(i) $s_i \succ s_{i-1}$.
(ii) $s_{i+1} \succ s_i$.
(iii) $s_i \pmod{2^{b+1}} > 2^b$.
(iv) $s_{i+1} \pmod{2^{b+1}} > 2^b$.

If $i \equiv 3 \pmod 4$:
(v) $s_i \prec s_{i-1}$.
(vi) $s_{i+1} \prec s_i$.
(vii) $s_i \pmod{2^{b+1}} \le 2^b$.
(viii) $s_{i+1} \pmod{2^{b+1}} \le 2^b$.

Proof. The proof is by induction, and is similar to Lemma 4 with some crucial differences in part (i). For parts (iv) and (viii) we will assume $s_{i+1} = \mathbf{M}(s_i)$ exists.

(i) The base case of $i = 1$ follows since we have assumed $r = 0$, and therefore, as argued above, $s_0 = x[a]$ satisfies $s_0 \pmod{2^{b+1}} \le 2^b$. Suppose for some $i \equiv 1 \pmod 4$ we get $s_i \prec s_{i-1}$ and for all $j < i$ the lemma is indeed true. Since $(s_{i-1}, s_i) \in H$, $s_{i-1} \succ s_i$ implies $s_{i-1} \pmod{2^{b+1}} > 2^b$ (recall $r = 0$). We now exhibit a different matching $\mathbf{M}'$ with larger $\Phi(\cdot)$ value, contradicting the choice of $\mathbf{M}$.

Define the sets of edges $E_- := \{(s_j, s_{j+1}) : j \text{ odd}, -1 \le j < i\}$ and $E_+ := (s_{-1}, s_1) \cup \{(s_{j-1}, s_{j+2}) : j \text{ odd}, 1 \le j \le i-4\} \cup (s_{i-1}, s_{i-3})$. By induction, for $1 \le j \le i-4$, the pair $(s_{j-1}, s_{j+2})$ is precisely $(s_j \oplus e_a, s_{j+1} \oplus e_a)$. Here, $\oplus$ is the coordinate-wise sum, and $e_a$ is a vector with 0's on all coordinates but $a$, and either $+2^b$ or $-2^b$ on the $a$th coordinate depending on whether $j \equiv 3 \pmod 4$ or $1 \pmod 4$, respectively. Note that this argument requires $s_j$ and $s_{j+1}$ to fall in the same half modulo $2^{b+1}$ (both $> 2^b$ or both $\le 2^b$), which is guaranteed by the induction hypothesis. Since $(s_j, s_{j+1})$ was in $\mathbf{M}$, the pair $(s_{j-1}, s_{j+2})$ is a valid pair as well. Furthermore, if $s_j \succ s_{j+1}$, then $s_{j-1} \succ s_{j+2}$, and a similar statement is true with $\prec$ replacing $\succ$.

The above shows that we can swap $E_-$ by $E_+$ in $\mathbf{M}$ to get $\mathbf{M}'$ without changing the matched endpoints. Now for the weights. The weight of $E_-$, by induction, is

$$W_- = [f(s_0) - f(s_{-1})] + [f(s_1) - f(s_2)] + [f(s_4) - f(s_3)] + \cdots + [f(s_{i-1}) - f(s_{i-2})].$$

Observe the signs changing from term to term due to the induction hypothesis. Similarly, we get

$$w(E_+) = [f(s_1) - f(s_{-1})] + [f(s_0) - f(s_3)] + \cdots + [f(s_{i-5}) - f(s_{i-2})] + [f(s_{i-1}) - f(s_{i-3})].$$

Therefore, $w(E_-) = w(E_+)$ and $w(\mathbf{M}') = w(\mathbf{M})$.

Note, for odd $1 \le j \le i-4$, we have $|s_{j+1} - s_j| = |s_{j+2} - s_{j-1}|$. Thus, the only pairs which affect $\Phi$ are the pairs $(s_{-1}, s_0)$, $(s_{i-1}, s_{i-2})$ in $\mathbf{M}$ and $(s_{-1}, s_1)$, $(s_{i-1}, s_{i-3})$ in $\mathbf{M}'$. The following claim proves that if $s_i \prec s_{i-1}$, then $\Phi(\mathbf{M}') > \Phi(\mathbf{M})$.

Claim 1. $\mathrm{msd}(|s_{-1} - s_1|) + \mathrm{msd}(|s_{i-1} - s_{i-3}|) > \mathrm{msd}(|s_{-1} - s_0|) + \mathrm{msd}(|s_{i-1} - s_{i-2}|)$.

Proof. Note that $\mathrm{msd}(s_{-1} - s_0) = b$, by definition, since the pair lies in $M^0_{a,b}$. Furthermore, $s_1 = s_0 + 2^b$. The following easy observation implies that $\mathrm{msd}(s_{-1} - s_1) \ge b + 1$.

Observation 1. For integers $b, p$, if $2^b \mid p$ and $2^{b+1} \nmid p$, then $2^{b+1} \mid p \pm 2^b$.

Now we show that $\mathrm{msd}(|s_{i-1} - s_{i-3}|) \ge \mathrm{msd}(|s_{i-1} - s_{i-2}|)$. Let the RHS be $b'$. Note that $s_{i-3} - s_{i-1} = s_{i-2} - s_{i-1} + 2^b$, since by induction $s_{i-2} \pmod{2^{b+1}} \le 2^b$, and $(s_{i-3}, s_{i-2}) \in H$. Thus, if $b' \le b$, then $2^{b'} \mid |s_{i-2} - s_{i-1}|$ implies $2^{b'} \mid |s_{i-3} - s_{i-1}|$ as well. Thus, it suffices to show $b' \le b$. Suppose not, and $b' \ge b + 1$. This implies $2^{b+1} \mid (s_{i-2} - s_{i-1})$. By induction, we get that $s_{i-2} \pmod{2^{b+1}} \le 2^b$. By supposition, we have $s_{i-1} \pmod{2^{b+1}} > 2^b$. Contradiction. □

(ii) Suppose for some $i \equiv 1 \pmod 4$ we get $s_i \succ s_{i+1}$ and for all $j < i$ the lemma is indeed true. Delete the set of $\mathbf{M}$-edges $E_- := \{(s_j, s_{j+1}) : j \text{ odd}, -1 \le j \le i\}$, and add the set of edges $E_+ := (s_{-1}, s_1) \cup \{(s_{j-1}, s_{j+2}) : j \text{ odd}, 1 \le j \le i-4\} \cup (s_{i-3}, s_{i+1})$. As in the above case, check that $\mathbf{M} - E_- + E_+$ is a valid matching (this uses the induction hypothesis, as above) which leaves $s_i, s_{i-1}$ unmatched. Now we consider the weights. The weight of $E_-$, by induction, is

$$W_- = [f(s_0) - f(s_{-1})] + [f(s_1) - f(s_2)] + [f(s_4) - f(s_3)] + \cdots + [f(s_{i-1}) - f(s_{i-2})] + [f(s_{i+1}) - f(s_i)].$$

Observe the signs changing from term to term due to the induction hypothesis, except for the last term, which is assumed for the sake of contradiction. Similar to the previous case, we get

$$w(E_+) =: W_+ = [f(s_1) - f(s_{-1})] + [f(s_0) - f(s_3)] + \cdots + [f(s_{i-5}) - f(s_{i-2})] + [f(s_{i+1}) - f(s_{i-3})].$$

Thus, we get the weight of the new matching is precisely $w(\mathbf{M}) - W_- + W_+ = w(\mathbf{M}) + f(s_i) - f(s_{i-1})$. We proved in (i) that $s_i \succ s_{i-1}$, and since $(s_{i-1}, s_i)$ is not a violation, we get $f(s_i) > f(s_{i-1})$, contradicting the maximality of $\mathbf{M}$.

(iii) For $i \equiv 1 \pmod 4$, we have $(s_{i-1}, s_i) \in H$. We have proved in (i) that $s_i \succ s_{i-1}$. Therefore (since $r = 0$), $s_i \pmod{2^{b+1}} > 2^b$.

We now note that parts (v), (vi), and (vii) can be proved similarly to the three cases above. We do not repeat them here. We now show that (iv) and (viii) follow as easy corollaries.

(iv), (viii) For $i \equiv 1 \pmod 4$, if $s_{i+1} \pmod{2^{b+1}} \le 2^b$, then $s_{i+2} \succ s_{i+1}$, which contradicts (v) for $i + 2 \equiv 3 \pmod 4$. For $i \equiv 3 \pmod 4$, if $s_{i+1} \pmod{2^{b+1}} > 2^b$, then $s_{i+1} \succ s_{i+2}$, which contradicts (i) for $i + 2 \equiv 1 \pmod 4$. □

Armed with this handle on the ancestor-descendant relationships, we can prove the progress and disjointedness lemmas alluded to in §3.



Lemma 9. For odd $i$, $s_i$ is not $\mathbf{M}$-unmatched.

Proof. Suppose it is. Then, as in the proof of the previous lemma, we can find a better matching. Once again, assume $i \equiv 1 \pmod 4$ (leaving the other case out since it is analogous). We delete the set of edges $E_- := \{(s_j, s_{j+1}) : j \text{ odd}, -1 \le j \le i-2\}$ and add the set of edges $E_+ = (s_{-1}, s_1) \cup \{(s_{j-1}, s_{j+2}) : j \text{ odd}, 1 \le j \le i-3\}$. As in the above lemma, we get that $\mathbf{M} - E_- + E_+$ is a valid matching whose weight is, as before, $w(\mathbf{M}) + f(s_i) - f(s_{i-1}) > w(\mathbf{M})$, since $s_i \succ s_{i-1}$ and we have assumed that the pair $(s_{i-1}, s_i)$ is not violated. □

Lemma 10. For odd $i$, $s_i \notin X$.

Proof. We need to show that $(s_i, s_{i+1}) \notin M$. Suppose $i \equiv 1 \pmod 4$ (the other case has the same argument). Lemma 8 (iii), (iv) shows that both $s_i \pmod{2^{b+1}}$ and $s_{i+1} \pmod{2^{b+1}}$ are $> 2^b$. Now, if the remainders are the same, then $2^{b+1} \mid |s_i - s_{i+1}|$, implying $\mathrm{msd}(|s_i - s_{i+1}|) > b$, which in turn implies $(s_i, s_{i+1}) \notin M$. If the remainders are not the same, then they differ by a nonzero amount less than $2^b$. This implies that $2^b \nmid |s_i - s_{i+1}|$, which again implies $(s_i, s_{i+1}) \notin M$. □

We conclude that for any $x \in X$, if no $(s_i, s_{i+1}) \in C$ for even $i$, then $S_x$ can never terminate. The non-termination contradicts Prop. 1, and therefore our supposition must be wrong. This ends the proof of Lemma 7, and thus the proof of Theorem 2.



6. A Pseudo-Distance for $\mathcal{L}_{\alpha,\beta}$

A key concept that leads to the unification of Lipschitz and monotonicity is a pseudo-distance defined on $D$. This distance provides a lot of power in manipulating the alternating paths for more general properties. The monotonicity proof requires no distance, so generalizing it for Lipschitz properties is quite non-trivial. An important feature of the distance is the triangle inequality. The challenge faced in the final proof is teasing out all the places in the previous argument where the distance function is "hidden". This involves replacing many equalities in the monotonicity argument with inequalities (going the "right way") based on the triangle inequality of this distance.

We begin by defining a weighted directed graph $G = (D, E)$ where $D$ in this section is the hypergrid $[k]^n$. $E$ contains directed edges of the form $(x,y)$, where $\|x - y\|_1 = 1$. The length of edge $(x,y)$ is given as follows. If $x \prec y$, the length is $-\alpha$. If $x \succ y$, the length is $\beta$.

Definition 2. The function $d(x,y)$ between $x, y \in D$ is the shortest path length from $x$ to $y$ in $G$.

This function is asymmetric, meaning that $d(x,y)$ and $d(y,x)$ are possibly different. Furthermore, $d(x,y)$ can be negative, so this does not truly qualify to be a distance (in the usual parlance of metrics). Nonetheless, $d(x,y)$ has many useful properties, which can be proven by expressing it in a more convenient form. Given any $x, y \in D$, we define $\mathrm{hcd}(x,y)$ to be the $z \in D$ maximizing $\|z\|_1$ such that $x \succ z$ and $y \succ z$. That is, $z$ is the highest common descendant of $x$ and $y$. Note that if $x \succ y$ then $\mathrm{hcd}(x,y) = y$.

Claim 2. For any $x, y \in D$, $d(x,y) = \beta\|x - \mathrm{hcd}(x,y)\|_1 - \alpha\|y - \mathrm{hcd}(x,y)\|_1$.

Proof. Let us partition the coordinate set $[n] = A \cup B \cup C$ with the following property. For all $i \in A$, $x_i > y_i$. For all $i \in B$, $x_i < y_i$, and for all $i \in C$, $x_i = y_i$. Any path in $G$ can be thought of as a sequence of coordinate increments and decrements. Any path from $x$ to $y$ must, in total, decrement all coordinates in $A$, increment all coordinates in $B$, and preserve coordinates in $C$. Furthermore, any increment adds $-\alpha$ to the path length, and a decrement adds $\beta$.

Fix a path, and let $I_i$ and $D_i$ denote the number of increments and decrements in dimension $i$. For $i \in A$, $D_i = I_i + |x_i - y_i|$; for $i \in B$, $I_i = D_i + |x_i - y_i|$; and for $i \in C$, $I_i = D_i$. The path length is given by

$$\sum_{i \in A} (\beta D_i - \alpha I_i) + \sum_{i \in B} (\beta D_i - \alpha I_i) + \sum_{i \in C} (\beta D_i - \alpha I_i)$$
$$= \sum_{i \in A} \big[\beta|x_i - y_i| + I_i(\beta - \alpha)\big] + \sum_{i \in B} \big[-\alpha|x_i - y_i| + D_i(\beta - \alpha)\big] + \sum_{i \in C} I_i(\beta - \alpha)$$
$$\ge \beta \sum_{i \in A} (x_i - y_i) - \alpha \sum_{i \in B} (y_i - x_i).$$

For the inequality, we use the fact that $\beta > \alpha$. Let $z = \mathrm{hcd}(x,y)$. Note that $z_i = \min(x_i, y_i)$. Consider the path from $x$ that only decrements to reach $z$, and then only increments to reach $y$. The length of this path is exactly $\beta \sum_{i \in A} (x_i - y_i) - \alpha \sum_{i \in B} (y_i - x_i)$, which is the RHS in the claim statement. □
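Claim 2 can be checked numerically on a small grid. The sketch below is only an illustration under stated assumptions: coordinates range over $\{1, \dots, k\}$, $\alpha$ and $\beta$ are finite integers with $\beta \ge \alpha$ (so the graph has no negative cycle), and the shortest paths are computed with Bellman-Ford, a technique chosen here for convenience because increments may have negative length. All function names are hypothetical.

```python
from itertools import product

def hcd(x, y):
    """Highest common descendant: the coordinate-wise minimum."""
    return tuple(min(a, b) for a, b in zip(x, y))

def d_closed_form(x, y, alpha, beta):
    """beta * ||x - hcd||_1 - alpha * ||y - hcd||_1, as in Claim 2."""
    z = hcd(x, y)
    down = sum(a - c for a, c in zip(x, z))
    up = sum(b - c for b, c in zip(y, z))
    return beta * down - alpha * up

def d_shortest_path(src, n, k, alpha, beta):
    """Bellman-Ford distances from src; increments cost -alpha, decrements beta."""
    points = list(product(range(1, k + 1), repeat=n))
    dist = {p: float("inf") for p in points}
    dist[src] = 0
    for _ in range(len(points) - 1):
        changed = False
        for p in points:
            if dist[p] == float("inf"):
                continue
            for i in range(n):
                for step, cost in ((1, -alpha), (-1, beta)):
                    q = p[:i] + (p[i] + step,) + p[i + 1:]
                    if q in dist and dist[p] + cost < dist[q]:
                        dist[q] = dist[p] + cost
                        changed = True
        if not changed:
            break
    return dist

# Lipschitz case (alpha = -1, beta = 1): the pseudo-distance is the l1 distance.
print(d_closed_form((3, 1), (1, 2), -1, 1))  # 3
```

Comparing `d_shortest_path(src, n, k, alpha, beta)[y]` against `d_closed_form(src, y, alpha, beta)` for all pairs on a small grid gives a quick sanity check of the claim.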

It is instructive to keep in mind what this distance translates to in the case of monotonicity and Lipschitz. In the case of monotonicity (when $\alpha = 0$, $\beta = \infty$), we get $d(x,y) = \infty$ unless $x \prec y$, in which case $d(x,y) = 0$. In the case of Lipschitz, the distance $d(x,y)$ is precisely the $\ell_1$ distance $d(x,y) = \|x - y\|_1$ (which is the Hamming distance on the hypercube). The next two claims establish some properties of the pseudo-distance.

Claim 3. (Linearity) If $x \succ z \succ y$ or $x \prec z \prec y$, $d(x,y) = d(x,z) + d(z,y)$.

(Triangle Inequality) For any $x, y, z \in D$, $d(x,y) \le d(x,z) + d(z,y)$.

(Projection) Let $x, y \in D$ and $v$ be a vector whose only non-zero coordinate is $a$. Let $x' = x \oplus v$ and $y' = y \oplus v$, where $\oplus$ is the coordinate-wise sum, and furthermore suppose $x', y' \in D$. Then $d(x,y) = d(x',y')$.

(Positivity) If $d(x,y) = 0$, then $d(y,x) > 0$.



Proof. The linearity property follows from Claim 2. Suppose $x \succ z \succ y$. We have $\mathrm{hcd}(x,y) = y$, $\mathrm{hcd}(x,z) = z$, and $\mathrm{hcd}(z,y) = y$. Hence, $d(x,y) = \beta\|x - y\|_1 = \beta(\|x - z\|_1 + \|z - y\|_1) = d(x,z) + d(z,y)$. The other case is analogous.

The triangle inequality follows because $d(x,y)$ is a shortest path length. For the projection property, let $z = \mathrm{hcd}(x,y)$ and let $z' = \mathrm{hcd}(x',y')$. Note that $z$ and $z'$ also differ only in the $a$th coordinate by the same amount $v_a$. Thus, $\|x - z\|_1 = \|x' - z'\|_1$ and $\|y - z\|_1 = \|y' - z'\|_1$, implying $d(x,y) = d(x',y')$. Suppose the positivity property does not hold. So $d(x,y) = 0$ and $d(y,x) \le 0$. Hence, $\beta\|x - z\|_1 = \alpha\|y - z\|_1$ and $\beta\|y - z\|_1 \le \alpha\|x - z\|_1$. Adding, we get $\beta \le \alpha$, a contradiction. □

The following lemma connects the distance to the property $\mathcal{L}_{\alpha,\beta}$.

Lemma 11. A function is $(\alpha,\beta)$-Lipschitz iff for all $x, y \in D$, $f(x) - f(y) - d(x,y) \le 0$.

Proof. Suppose the function satisfies the inequality for all $x, y$. If $x$ and $y$ differ in one coordinate by 1 with $x \succ y$, we get $f(x) - f(y) \le d(x,y) = \beta$ and $f(y) - f(x) \le -\alpha$, implying $f$ is $(\alpha,\beta)$-Lipschitz. Conversely, suppose $f$ is $(\alpha,\beta)$-Lipschitz. Setting $z = \mathrm{hcd}(x,y)$ (for $x, y \in D$), we get $f(x) - f(z) \le \beta\|x - z\|_1$ and $\alpha\|y - z\|_1 \le f(y) - f(z)$. Combining these, $f(x) - f(y) \le \beta\|x - z\|_1 - \alpha\|y - z\|_1 = d(x,y)$. □



The next lemma is a generalization of Theorem 5, which argued that the size of a minimum vertex cover is exactly $\varepsilon_f 2^n$. A similar statement is also known for the Lipschitz property, and we prove this for generalized Lipschitz functions. We crucially use the triangle inequality for $d(x,y)$.

We define an undirected weighted clique on $D$. Given a function $f$, we define the weight $w(x,y)$ (for any $x, y \in D$) as follows:

$$(2)\qquad w(x,y) := \max\big(f(x) - f(y) - d(x,y),\; f(y) - f(x) - d(y,x)\big)$$

Note that although the distance $d$ is asymmetric, the weights are defined on an undirected graph. Lemma 11 shows that a function is $(\alpha,\beta)$-Lipschitz iff all $w(x,y) \le 0$. Once again, it is instructive to understand the special cases of monotonicity and Lipschitz. For monotonicity, we get that $w(x,y) = f(x) - f(y)$ when $x \prec y$ and $-\infty$ otherwise. For Lipschitz, we get $w(x,y) = |f(x) - f(y)| - \|x - y\|_1$.
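The weight in (2) and the equivalence of Lemma 11 can be illustrated with a short Python sketch. It assumes finite integer $\alpha \le \beta$, functions stored as dictionaries from grid points (tuples) to integers, and hypothetical helper names; the pseudo-distance is computed from the closed form of Claim 2.

```python
from itertools import product

def d(x, y, alpha, beta):
    """Pseudo-distance via Claim 2 (hcd is the coordinate-wise minimum)."""
    down = sum(max(a - b, 0) for a, b in zip(x, y))   # ||x - hcd||_1
    up = sum(max(b - a, 0) for a, b in zip(x, y))     # ||y - hcd||_1
    return beta * down - alpha * up

def w(f, x, y, alpha, beta):
    """Weight of the undirected pair {x, y}, as in equation (2)."""
    return max(f[x] - f[y] - d(x, y, alpha, beta),
               f[y] - f[x] - d(y, x, alpha, beta))

def violated_pairs(f, alpha, beta):
    """All unordered pairs of positive weight."""
    pts = sorted(f)
    return [(x, y) for i, x in enumerate(pts) for y in pts[i + 1:]
            if w(f, x, y, alpha, beta) > 0]

def is_ab_lipschitz(f, alpha, beta):
    """Direct check on unit edges: alpha <= f(x + e_i) - f(x) <= beta."""
    for x in f:
        for i in range(len(x)):
            y = x[:i] + (x[i] + 1,) + x[i + 1:]
            if y in f and not (alpha <= f[y] - f[x] <= beta):
                return False
    return True

pts = list(product(range(1, 4), repeat=2))
f_good = {p: 2 * p[0] + p[1] for p in pts}   # unit increments are 2 or 1
f_bad = {p: p[0] * p[1] for p in pts}        # some unit increments equal 3
```

With $\alpha = 1$ and $\beta = 2$, `f_good` passes the edge check and has no pair of positive weight, whereas `f_bad` fails the edge check and has at least one such pair, in line with Lemma 11.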

We define the violation graph as $VG_f = (D, V_f)$ where $V_f = \{(x,y) : w(x,y) > 0\}$. The violation graph is unweighted. The following lemma generalizes Theorem 5 from [FLN+02].

Lemma 12. The size of a minimum vertex cover in $VG_f$ is exactly $\varepsilon_f|D|$.

Proof. Let $U$ be a minimum vertex cover in $VG_f$. Since each edge in $VG_f$ is a violation, the points at which the function is modified must intersect all edges, and therefore should form a vertex cover. Thus, $\varepsilon_f|D| \ge |U|$. We now show how to modify the function values at $U$ to get a function $f'$ with no violations. We invoke the following claim with $V = D - U$, and $f'(x) = f(x)$, $\forall x \in V$.

Claim 4. Consider a partial function $f'$ defined on a subset $V \subseteq D$, such that for all $x, y \in V$, $f'(x) - f'(y) \le d(x,y)$. It is possible to fill in the remaining values such that $\forall x, y \in D$, $f'(x) - f'(y) \le d(x,y)$.

Proof. We prove by backwards induction on the size of $V$. If $V = D$, this is trivially true. Now for the induction step. It suffices to just define $f'$ for some $u \notin V$. We need to set $f'(u)$ so that $f'(u) - f'(y) \le d(u,y)$ and $f'(x) - f'(u) \le d(x,u)$ for all $x, y \in V$. Let us first argue that

$$m := \max_{x \in V} \big(f'(x) - d(x,u)\big) \;\le\; \min_{y \in V} \big(f'(y) + d(u,y)\big) =: M$$

Suppose not, so for some $x, y \in V$, $f'(x) - d(x,u) > f'(y) + d(u,y)$. That implies that $f'(x) - f'(y) > d(x,u) + d(u,y) \ge d(x,y)$ (using the triangle inequality). That violates the condition, so $m \le M$. We can therefore set $f'(u) \in [m, M]$ and ensure that $\forall x, y \in V \cup \{u\}$, $f'(x) - f'(y) \le d(x,y)$. □

This gives a function $f'$ such that $\Delta(f, f') = |U|/|D|$. By Lemma 11, $f'$ is $(\alpha,\beta)$-Lipschitz, and $|U| \ge \varepsilon_f|D|$. Hence, $|U| = \varepsilon_f|D|$. □
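The induction in Claim 4 is constructive, and can be sketched as a greedy fill-in procedure. The Python below is an illustration under stated assumptions: finite integer $\alpha \le \beta$, the closed form of Claim 2 for $d$, hypothetical function names, the choice $f'(u) = m$ (any value in $[m, M]$ would do), and the arbitrary value 0 if the partial function is empty.

```python
from itertools import product

def make_d(alpha, beta):
    """Return the pseudo-distance d(x, y) for the given alpha, beta."""
    def d(x, y):
        down = sum(max(a - b, 0) for a, b in zip(x, y))
        up = sum(max(b - a, 0) for a, b in zip(x, y))
        return beta * down - alpha * up
    return d

def extend(partial, domain, d):
    """Fill in missing values one point at a time, as in the proof of Claim 4."""
    g = dict(partial)
    for u in domain:
        if u in g:
            continue
        if not g:                 # assumption: start from 0 on an empty function
            g[u] = 0
            continue
        m = max(g[x] - d(x, u) for x in g)
        M = min(g[y] + d(u, y) for y in g)
        if m > M:
            raise ValueError("partial function already violates the condition")
        g[u] = m                  # any value in [m, M] works
    return g

domain = list(product(range(1, 4), repeat=2))
filled = extend({(1, 1): 0, (3, 3): 3}, domain, make_d(-1, 1))
```

The given values are never changed, and after the loop every ordered pair in the domain satisfies $f'(x) - f'(y) \le d(x,y)$, provided the partial function satisfied it to begin with.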

The following is a simple corollary of the previous lemma; it follows since the endpoints of any maximal matching form a vertex cover.

Corollary 1. The size of any maximal matching in $VG_f$ is $\ge \frac{1}{2}\varepsilon_f|D|$.

Below, we make a technical claim that allows for easier arguments about $w$. Essentially, by a perturbation argument, we can assume that $w(x,y)$ is never exactly zero. Note that this justifies the strict inequalities we have encountered so far.

Claim 5. For any function $f$, there exists a function $f'$ with the following properties. Both $f$ and $f'$ have the same number of violated edges, $\varepsilon_f = \varepsilon_{f'}$, and for all $x, y \in D$, $w_{f'}(x,y) \ne 0$.

Proof. We will construct a function $f'$ such that $w_{f'}(x,y)$ has the same sign as $w_f(x,y)$ whenever $w_f(x,y) \ne 0$. When $w_f(x,y) = 0$, then $w_{f'}(x,y) < 0$. Since exactly the same pairs have a strictly positive weight, their violation graphs are identical. Both functions have the same number of violated edges, and by Lemma 12, $\varepsilon_f = \varepsilon_{f'}$.

Set $f'(x) = (1 - \eta_f) f(x) + \sigma_f \|x\|_1$, where $\eta_f$ and $\sigma_f$ are very small positive constants depending on $L$, the precision of $f$. We have $f'(x) - f'(y) = (1 - \eta_f)(f(x) - f(y)) + \sigma_f(\|x\|_1 - \|y\|_1)$.

If $f(x) \ne f(y)$, then $f'(x) - f'(y) - d(x,y)$ has the same sign as $f(x) - f(y) - d(x,y)$. Under this circumstance, when $w_f(x,y) \ne 0$, $w_{f'}(x,y)$ has the same sign.

Suppose $f(x) = f(y)$. If $w_f(x,y)$ is non-zero, then (since $\sigma_f$ is so small) $w_{f'}(x,y)$ maintains the sign. So assume that $w_f(x,y) = 0$. Wlog, $d(x,y) = 0$, so by Claim 3 (positivity) $d(y,x) > 0$. Setting $z = \mathrm{hcd}(x,y)$, we get $\beta\|x - z\|_1 = \alpha\|y - z\|_1$ and $\alpha\|x - z\|_1 < \beta\|y - z\|_1$. Adding and using the fact that $\alpha + \beta > 0$, $\|x - z\|_1 < \|y - z\|_1$. Hence $\|x\|_1 - \|y\|_1 < 0$, and $f'(x) - f'(y) < 0$. Therefore, $w_{f'}(x,y) < 0$. □

7. Generalized Lipschitz Testing on Hypergrids 



In this section, we prove Theorem 4. Intuitively, with the distance $d(x,y)$ in place, the basic spirit of the monotonicity proofs can be carried over. The final proof, however, is much more complex and requires many algebraic manipulations. We do not explicitly have the "directed" behavior of monotonicity that allows for many of the rewiring arguments to be performed. The properties of the distance provide the tools to rewire the matchings.



We borrow the definition of $H$ from §5: the generalized Lipschitz tester picks a pair $(x,y) \in H$ at random, with say $x \succ y$, and checks if $\alpha\|x - y\|_1 \le f(x) - f(y) \le \beta\|x - y\|_1$. The weight of an edge $(x,y)$ is as defined in (2). As in §5, we choose $\mathbf{M}$ to be the maximum weight matching which maximizes $\Phi(\mathbf{M})$ as defined by (1). The classification of $\mathbf{M}$, and the partition of $H$ into $2n(\ell+1)$ classes, are also borrowed from §5, and in the remainder of the section we will prove Lemma 7 in the context of generalized Lipschitz.

As in the proof of Lemma 7, we focus on the case of $r = 0$ and fix $a, b$. We lose the sub/superscripts and feed $M$, $H$, $\mathbf{M}$ into the machinery of §3 to obtain the sequences $S_x$ for all the endpoints of $M$. We fix $x \in X$ and assume $x[a] \pmod{2^{b+1}} \le 2^b$. We let $s_i$ denote $S_x(i)$, and also, at times abusing notation, let it denote $s_i[a]$. We restate that for even $i$, $(s_i, s_{i+1}) \in H$; for odd $i$, $(s_i, s_{i+1}) \in \mathbf{M}$. For contradiction's sake, we assume for all even $i$, $(s_i, s_{i+1})$ satisfies the generalized Lipschitz property. That is, if $s_i \succ s_{i+1}$, then $\alpha 2^b < f(s_i) - f(s_{i+1}) < \beta 2^b$, else the inequalities are reversed. Note that we have strict inequalities; this follows from Claim 5. Till now, we have just mimicked what we had done in §5. However, the inherent directionality in the monotonicity case led to simple weights and therefore (in comparison to what is to follow) simpler (to read, at least) proofs. The proof of Lemma 7 for generalized Lipschitz is quite involved, and needs some notation.



Notation. Let $y = \mathbf{M}(x)$. We denote $y$ by $s_{-1}$ as well. The weight $w(y,x)$ is given by $\max(f(x) - f(y) - d(x,y),\; f(y) - f(x) - d(y,x))$. To abstract out these two cases cleanly, we define the following.

• Order dependent functions $d_{-1}$ and $d_1$: $d_1(x,y) = d(x,y)$ and $d_{-1}(x,y) = d(y,x)$.

• The marker bit $b$: If $w(y,x) = f(y) - f(x) - d(y,x)$, then $b = 1$. Otherwise, it is $-1$.

• The function $\sigma(y,x,b) := b(f(y) - f(x)) - d_b(y,x)$.

• Indicator $\mu_i$, which is 1 if $i \equiv 1 \pmod 4$ and 0 if $i \equiv 3 \pmod 4$.

Note that for any two points $u, v \in [k]^n$, $w(u,v) \ge \sigma(u,v,b)$ for any $b \in \{-1,+1\}$. Also note that $w(y,x) = b(f(y) - f(x)) - d_b(y,x) = \sigma(y,x,b)$. The marker bit introduces the "direction" in the pairs of $\mathbf{M}$.
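The notation above can be spelled out in a few lines of Python. This is only a sketch with hypothetical names: the function is a dictionary from points to integers, and `d` is any two-argument pseudo-distance (here built from the closed form of Claim 2 with finite integer $\alpha \le \beta$).

```python
def make_d(alpha, beta):
    """Pseudo-distance via the closed form of Claim 2."""
    def d(x, y):
        down = sum(max(a - b, 0) for a, b in zip(x, y))
        up = sum(max(b - a, 0) for a, b in zip(x, y))
        return beta * down - alpha * up
    return d

def weight(f, d, u, v):
    """w(u, v) as in equation (2)."""
    return max(f[u] - f[v] - d(u, v), f[v] - f[u] - d(v, u))

def sigma(f, d, u, v, b):
    """sigma(u, v, b) = b * (f(u) - f(v)) - d_b(u, v), with d_1 = d and d_{-1} = d reversed."""
    d_b = d(u, v) if b == 1 else d(v, u)
    return b * (f[u] - f[v]) - d_b

def marker_bit(f, d, y, x):
    """+1 if w(y, x) is attained by f(y) - f(x) - d(y, x), otherwise -1."""
    return 1 if weight(f, d, y, x) == f[y] - f[x] - d(y, x) else -1

f = {(1, 1): 0, (2, 1): 5, (1, 2): -4, (2, 2): 1}
d = make_d(-1, 1)
y, x = (2, 1), (1, 2)
```

For every pair, `weight` equals the larger of the two values `sigma(..., 1)` and `sigma(..., -1)`, and `sigma` evaluated at the marker bit recovers the weight exactly, which is the content of the two remarks above.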

Pair sets $E_-(i)$ and $E_+(i)$: As in the monotonicity case, one of the main aspects of the argument is modifying the matching $\mathbf{M}$ by deleting and inserting some pairs and finding a 'better' matching. This will lead to various inequalities involving $f$- and $d$-values. The pairs added will depend on the statement we wish to prove. Nonetheless, there is a core set of common pairs. The matching $\mathbf{M}$ is modified by removing all pairs incident to $S_x$, up to (but not including) $s_i$. What do these pairs look like? The reader may find it useful to consult Fig. 2 for reference. The pairs are $(y,x)$, $(s_1, s_2)$, $(s_3, s_4)$, ..., $(s_{i-2}, s_{i-1})$. (For $i = 1$, this is just $(y,x)$. As long as $\mathbf{M}(s_j)$ is defined for $1 \le j < i$, this sequence of pairs is well-defined.) This leads us to define the subset $E_-(i) \subseteq \mathbf{M}$. The minus is to denote pairs to be removed. For later convenience, we split the union in two groups.

$$E_-(i) = \{(y,x)\} \cup \{(s_j, s_{j+1}) : j \text{ is odd}, 1 \le j \le i-2\}$$
$$= \{(y,x)\} \cup \{(s_{4\ell+1}, s_{4\ell+2}) : 0 \le \ell \le \lfloor i/4 \rfloor - \mu_i\} \cup \{(s_{4\ell+3}, s_{4\ell+4}) : 0 \le \ell \le \lfloor i/4 \rfloor - 1\}$$

Note that $|E_-(i)| = \frac{i+1}{2}$. The aim is to select a set whose weight can be compared to $w(E_-(i))$. We will prove shortly that this weight (sort of) looks like $\sigma(y,x,1) + \sigma(s_1,s_2,-1) + \sigma(s_3,s_4,1) + \sigma(s_5,s_6,-1) \cdots$. In other words, the bit argument keeps switching. Let us focus on the $f(\cdot)$ terms in $w(E_-)$. We have $[f(y) - f(x)] + [f(s_2) - f(s_1)] + [f(s_3) - f(s_4)] + [f(s_6) - f(s_5)] + [f(s_7) - f(s_8)] + \cdots$. We wish to pair these up differently but maintain the same "weight structure". We will always pair terms with odd and even indices together (except for $y$). We start with $(y, s_1)$. Now, $x = s_0$ needs to be paired with an odd indexed $s_j$ with $f(s_j)$ with a negative coefficient. So we get $(s_0, s_3)$. The next to be paired is $s_2$, which we manage by $(s_2, s_5)$. Then we get $(s_4, s_7)$. We want to stay on vertices used in $E_-(i)$, so we will not involve $s_i$. Formally,

$$E_+(i) = \{(y,s_1)\} \cup \{(s_j, s_{j+3}) : j \text{ is even}, 0 \le j \le i-5\}$$
$$= \{(y,s_1)\} \cup \{(s_{4\ell}, s_{4\ell+3}) : 0 \le \ell \le \lfloor i/4 \rfloor - 1\} \cup \{(s_{4\ell-2}, s_{4\ell+1}) : 1 \le \ell \le \lfloor i/4 \rfloor - \mu_i\}$$

Note that $|E_+(i)| = \frac{i-1}{2}$.

Proposition 3. The pairs in $E_-(i)$ exactly involve all vertices in $\{s_j : -1 \le j \le i-1\}$. The pairs in $E_+(i)$ exactly involve the vertices in $E_-(i) \setminus \{s_{i-3}, s_{i-1}\}$.
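The index bookkeeping behind $E_-(i)$ and $E_+(i)$ can be checked mechanically. The Python sketch below represents $s_j$ by its index $j$ (so $y = s_{-1}$ is $-1$ and $x = s_0$ is $0$), builds both sets from the first form of each definition and from the split form, and is meant for odd $i \ge 3$. The function names are hypothetical. Note that the last removed pair is $(s_{i-2}, s_{i-1})$, so the indices covered by $E_-(i)$ run from $-1$ up to $i-1$.

```python
def mu(i):
    """Indicator: 1 if i = 1 (mod 4), 0 if i = 3 (mod 4)."""
    return 1 if i % 4 == 1 else 0

def e_minus(i):
    """{(y, x)} plus (s_j, s_{j+1}) for odd j with 1 <= j <= i - 2."""
    return [(-1, 0)] + [(j, j + 1) for j in range(1, i - 1, 2)]

def e_plus(i):
    """{(y, s_1)} plus (s_j, s_{j+3}) for even j with 0 <= j <= i - 5."""
    return [(-1, 1)] + [(j, j + 3) for j in range(0, i - 4, 2)]

def e_minus_split(i):
    """The same set written as two groups indexed by l."""
    q = i // 4
    return ([(-1, 0)]
            + [(4 * l + 1, 4 * l + 2) for l in range(0, q - mu(i) + 1)]
            + [(4 * l + 3, 4 * l + 4) for l in range(0, q)])

def e_plus_split(i):
    q = i // 4
    return ([(-1, 1)]
            + [(4 * l, 4 * l + 3) for l in range(0, q)]
            + [(4 * l - 2, 4 * l + 1) for l in range(1, q - mu(i) + 1)])

def vertices(pairs):
    return {v for pair in pairs for v in pair}

print(e_plus(9))  # [(-1, 1), (0, 3), (2, 5), (4, 7)]
```

For $i = 9$ the indices 6 and 8 are exactly the ones covered by $E_-(9)$ but not by $E_+(9)$.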

We now make some useful definitions that come close to capturing the weights of $E_-$ and $E_+$.

Definition 3. For odd $i$, suppose $s_i$ exists. Define sums $W_+(i)$ and $W_-(i)$ as follows.

$$W_-(i) = \sigma(y,x,b) + \sum_{\ell=0}^{\lfloor i/4 \rfloor - \mu_i} \sigma(s_{4\ell+1}, s_{4\ell+2}, -b) + \sum_{\ell=0}^{\lfloor i/4 \rfloor - 1} \sigma(s_{4\ell+3}, s_{4\ell+4}, b)$$

$$W_+(i) = \sigma(y,s_1,b) + \sum_{\ell=0}^{\lfloor i/4 \rfloor - 1} \sigma(s_{4\ell}, s_{4\ell+3}, -b) + \sum_{\ell=1}^{\lfloor i/4 \rfloor - \mu_i} \sigma(s_{4\ell-2}, s_{4\ell+1}, b)$$

For example, when $i = 9$, $b = 1$, we get $W_-(9) = \sigma(y,x,1) + \sigma(s_1,s_2,-1) + \sigma(s_3,s_4,1) + \sigma(s_5,s_6,-1) + \sigma(s_7,s_8,1)$. Note the alternating bits. $W_+(9) = \sigma(y,s_1,1) + \sigma(s_0,s_3,-1) + \sigma(s_2,s_5,1) + \sigma(s_4,s_7,-1)$. Note that $s_8, s_6$ are missing (as they are missing in $E_+(9)$).

The following claim calculates the difference between $W_-(i)$ and $W_+(i)$. We prove this claim after proving our main lemma below, where this claim will be used.

Claim 6. For odd $i$,

$$W_+(i) - W_-(i) = (-1)^{\mu_i+1} b\,\big(f(s_{i-1}) - f(s_{i-3})\big) - d_b(y, s_1) + d_b(y, x) + d_{(-1)^{\mu_i+1} b}(s_{i-2}, s_{i-1})$$

As in the monotonicity case, we need to understand the weights of the pairs of $M$ in $S_x$. For
odd $i$, we know that $w(s_i,s_{i+1})$ is $\max(\sigma(s_i,s_{i+1},-1), \sigma(s_i,s_{i+1},1))$, but which value does it take?
To execute the argument described above, we need to know this. It turns out that this is exactly
decided by $\mu_i$, and therefore has a very consistent behavior. This is the real workhorse of the proof,
and brings out the directionality required for our rewiring. The following lemma is analogous to
Lemma 4 in §4 and Lemma 8 in §5.



Lemma 13. For odd $i$, suppose $s_{i+1} = M(s_i)$ is defined. Then,

i. $w(s_i,s_{i+1}) = \sigma(s_i,s_{i+1},(-1)^{\mu_i}\mathbf{b})$.

ii. If $i \equiv 1 \pmod 4$, $s_{i+1} \pmod{2^{b+1}} \geq 2^b$.

iii. If $i \equiv 3 \pmod 4$, $s_{i+1} \pmod{2^{b+1}} < 2^b$.

Proof. The proof is by induction over $i$. For $i = -1$, we are looking at $w(s_{-1},s_0) = w(y,x)$, and
$\mu_{-1} = 0$. The parameter $\mathbf{b}$ was chosen so that $w(y,x) = \sigma(y,x,\mathbf{b})$. Therefore, (i) holds. We also
assumed $x \pmod{2^{b+1}} < 2^b$. We now perform the induction step. For an odd $i$, suppose the lemma
is true for all odd $j < i$.

(i.) Suppose for some $i$, $w(s_i,s_{i+1}) = \sigma(s_i,s_{i+1},(-1)^{\mu_i+1}\mathbf{b})$. As explained earlier, we will define a
set of $M$-pairs $E_{rem}$ and another set of pairs $E_{add}$. We choose $E_{rem} := E_-(i) \cup \{(s_i,s_{i+1})\}$. The
set of new pairs, $E_{add}$, is defined as $E_+(i) \cup \{(s_{i-3},s_{i+1})\}$. Observe that $E_+(i)$ does not involve $s_{i-3}$
(the largest even index involved is $i-5$), so this is a valid set of matched pairs.

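We now compute the weights of edges in these sets. By induction, note that we have $w(E_-(i)) =
W_-(i)$ and therefore, we get the following. Note the last term is the one we intend to contradict.

\[
w(E_{rem}) = W_-(i) + \sigma(s_i,s_{i+1},(-1)^{\mu_i+1}\mathbf{b})
\]
We can also lower bound $w(E_{add})$ as follows:

\[
\begin{aligned}
w(E_{add}) &\geq \sigma(y,s_1,\mathbf{b}) + \sum_{\ell=0}^{\lfloor i/4\rfloor-1} \sigma(s_{4\ell},s_{4\ell+3},-\mathbf{b}) + \sum_{\ell=1}^{\lfloor i/4\rfloor-\mu_i} \sigma(s_{4\ell-2},s_{4\ell+1},\mathbf{b}) + \sigma(s_{i-3},s_{i+1},(-1)^{\mu_i+1}\mathbf{b}) \\
&= W_+(i) + \sigma(s_{i-3},s_{i+1},(-1)^{\mu_i+1}\mathbf{b})
\end{aligned}
\]
Therefore, we get

\[
\begin{aligned}
w(E_{add}) - w(E_{rem}) &\geq W_+(i) - W_-(i) + \sigma(s_{i-3},s_{i+1},(-1)^{\mu_i+1}\mathbf{b}) - \sigma(s_i,s_{i+1},(-1)^{\mu_i+1}\mathbf{b}) \\
&= W_+(i) - W_-(i) + (-1)^{\mu_i+1}\mathbf{b}\,(f(s_{i-3}) - f(s_i)) \\
&\qquad + d_{(-1)^{\mu_i+1}\mathbf{b}}(s_i,s_{i+1}) - d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-3},s_{i+1}) \\
&\geq W_+(i) - W_-(i) + (-1)^{\mu_i+1}\mathbf{b}\,(f(s_{i-3}) - f(s_i)) - d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-3},s_i)
\end{aligned}
\tag{3}
\]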

The last inequality follows from the triangle inequality (Prop. 1) for $d$, and therefore for $d_{\mathbf{b}}$ for any bit $\mathbf{b}$.
Now we use Claim 6 connecting $W_-(i)$ and $W_+(i)$. Combining Claim 6 with (3), we get

\[
\begin{aligned}
w(E_{add}) - w(E_{rem}) &\geq (-1)^{\mu_i+1}\mathbf{b}\,(f(s_{i-1}) - f(s_i)) - d_{\mathbf{b}}(y,s_1) + d_{\mathbf{b}}(y,x) + d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-2},s_{i-1}) \\
&\qquad - d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-3},s_i)
\end{aligned}
\tag{4}
\]

By Prop. 3, we can remove $E_{rem}$ and add $E_{add}$ to get a valid matching. Because $M$ is a maximum
weight matching, $w(E_{rem}) \geq w(E_{add})$. This can be used to get a bound on the $f$-value difference
between two adjacent vertices as follows.

Claim 7. For odd $i$,

\[
(-1)^{\mu_i}\mathbf{b}\,(f(s_{i-1}) - f(s_i)) \geq d_{\mathbf{b}}(y,x) - d_{\mathbf{b}}(y,s_1).
\]
Proof. We do the case of $i \equiv 1 \pmod 4$; the other case is analogous. Hence, $(-1)^{\mu_i}\mathbf{b} = -\mathbf{b}$.
Substituting in (4),

\[
\begin{aligned}
w(E_{add}) - w(E_{rem}) &\geq \mathbf{b}\,(f(s_{i-1}) - f(s_i)) - d_{\mathbf{b}}(y,s_1) + d_{\mathbf{b}}(y,x) \\
&\qquad + d_{\mathbf{b}}(s_{i-2},s_{i-1}) - d_{\mathbf{b}}(s_{i-3},s_i)
\end{aligned}
\]

We claim that $d_{\mathbf{b}}(s_{i-2},s_{i-1}) = d_{\mathbf{b}}(s_{i-3},s_i)$ using the projection property of $d$ (Claim 3), and thus
of $d_{\mathbf{b}}$. That is, we claim that $(s_{i-3},s_i) = (s_{i-2} \oplus v, s_{i-1} \oplus v)$ for some vector $v$ having a non-zero
coordinate only in the $a$th coordinate. To see this, first note that $(s_{i-3},s_{i-2})$ and $(s_{i-1},s_i)$ are in
$H$. Therefore, it suffices to check that $s_{i-3}$ and $s_{i-1}$ are on 'different sides' of $2^b$ modulo $2^{b+1}$. But
this is precisely what we get by induction from (ii), (iii). Now we use that $M$ is a maximum weight
matching. Since $w(E_{add}) - w(E_{rem}) \leq 0$, $\mathbf{b}\,(f(s_i) - f(s_{i-1})) \geq d_{\mathbf{b}}(y,x) - d_{\mathbf{b}}(y,s_1)$. □

This contradicts our main assumption and completes the proof of (i). 

Claim 8. If $(-1)^{\mu_i}\mathbf{b}\,(f(s_{i-1}) - f(s_i)) \geq d_{\mathbf{b}}(y,x) - d_{\mathbf{b}}(y,s_1)$, then $(s_{i-1},s_i)$ is a violating pair.

Proof. There are four cases. We do one of them. Suppose $\mathbf{b} = 1$, $i \equiv 1 \pmod 4$ and thus $\mu_i = 1$. Then
we have $f(s_i) - f(s_{i-1}) \geq d(y,x) - d(y,s_1)$. Since $r = 0$, the RHS is precisely $\beta 2^b$. Furthermore,
by induction we have that $s_{i-1} \pmod{2^{b+1}} < 2^b$. So, $s_i \succ s_{i-1}$. Note that $(s_{i-1},s_i) \in H$. If this is
not a violating pair, we must have $f(s_i) - f(s_{i-1}) < \beta 2^b$. Contradiction. □

We move to proving parts (ii),(iii). 

((ii),(iii)) We perform the proof for $i \equiv 1 \pmod 4$; the other case is analogous. Suppose for some
$i \equiv 1 \pmod 4$, $s_{i+1} \pmod{2^{b+1}} < 2^b$. Once again, we will identify sets $E_{rem}$ and $E_{add}$ such that
$M' = M - E_{rem} + E_{add}$ is a valid matching. Unlike in case (i), the contradiction will not be obtained
by finding a violated pair. Rather, we will show that $\Phi(M') > \Phi(M)$, contradicting the choice of
$M$. This is the main difference from the previous case's analysis. The calculations
are similar, but unfortunately not exactly the same.

As in the previous case, we choose $E_{rem} := E_-(i) \cup \{(s_i,s_{i+1})\}$. The set of new pairs, $E_{add}$, is
defined as $E_+(i) \cup \{(s_{i-1},s_{i+1})\} \cup \{(s_i,s_{i-3})\}$. By Prop. 3, $E_+(i)$ does not involve $\{s_{i-3},s_{i-1}\}$ (the
largest even index involved is $i-5$), so this is a valid set of matched pairs. We now compute the
weights of edges in these sets. As in case (i), by induction, note that we have $w(E_-(i)) = W_-(i)$
and therefore, we get

\[
w(E_{rem}) = W_-(i) + \sigma(s_i,s_{i+1},(-1)^{\mu_i}\mathbf{b})
\]

Note the difference in the last term of the RHS; in case (i), for contradiction we assumed it to have
the opposite sign. Now, by induction we know the weight. Thus it is crucial we prove (i) before
(ii),(iii).

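We lower bound $w(E_{add})$ as follows:

\[
\begin{aligned}
w(E_{add}) &\geq \sigma(y,s_1,\mathbf{b}) + \sum_{\ell=0}^{\lfloor i/4\rfloor-1} \sigma(s_{4\ell},s_{4\ell+3},-\mathbf{b}) + \sum_{\ell=1}^{\lfloor i/4\rfloor-\mu_i} \sigma(s_{4\ell-2},s_{4\ell+1},\mathbf{b}) \\
&\qquad + \sigma(s_{i-1},s_{i+1},(-1)^{\mu_i}\mathbf{b}) + \sigma(s_{i-3},s_i,(-1)^{\mu_i+1}\mathbf{b}) \\
&= W_+(i) + \sigma(s_{i-1},s_{i+1},(-1)^{\mu_i}\mathbf{b}) + \sigma(s_{i-3},s_i,(-1)^{\mu_i+1}\mathbf{b})
\end{aligned}
\]

Subtracting, we get

\[
\begin{aligned}
w(E_{add}) - w(E_{rem}) &\geq W_+(i) - W_-(i) + \sigma(s_{i-1},s_{i+1},(-1)^{\mu_i}\mathbf{b}) \\
&\qquad + \sigma(s_{i-3},s_i,(-1)^{\mu_i+1}\mathbf{b}) - \sigma(s_i,s_{i+1},(-1)^{\mu_i}\mathbf{b}) \\
&= W_+(i) - W_-(i) + (-1)^{\mu_i}\mathbf{b}\,(f(s_{i-1}) - f(s_{i-3})) \\
&\qquad - d_{(-1)^{\mu_i}\mathbf{b}}(s_{i-1},s_{i+1}) - d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-3},s_i) + d_{(-1)^{\mu_i}\mathbf{b}}(s_i,s_{i+1})
\end{aligned}
\]
Now, combining with Claim 6,

\[
\begin{aligned}
w(E_{add}) - w(E_{rem}) &\geq -d_{\mathbf{b}}(y,s_1) + d_{\mathbf{b}}(y,x) + d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-2},s_{i-1}) \\
&\qquad - d_{(-1)^{\mu_i}\mathbf{b}}(s_{i-1},s_{i+1}) - d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-3},s_i) + d_{(-1)^{\mu_i}\mathbf{b}}(s_i,s_{i+1}) \\
&= d_{\mathbf{b}}(y,x) - d_{\mathbf{b}}(y,s_1) - d_{(-1)^{\mu_i}\mathbf{b}}(s_{i-1},s_{i+1}) + d_{(-1)^{\mu_i}\mathbf{b}}(s_i,s_{i+1}) \\
&\geq d_{\mathbf{b}}(y,x) - d_{\mathbf{b}}(y,s_1) - d_{(-1)^{\mu_i+1}\mathbf{b}}(s_i,s_{i-1})
\end{aligned}
\tag{5}
\]

The equality follows since $d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-2},s_{i-1}) = d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-3},s_i)$ by the projection property of $d_{(-1)^{\mu_i+1}\mathbf{b}}$
(Claim 3), and the observation, as in the proof of Claim 7, that $(s_{i-3},s_i) = (s_{i-2} \oplus v, s_{i-1} \oplus v)$
for some vector $v$ having a non-zero coordinate in the $a$th coordinate. The second inequality above
follows from the triangle inequality.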

The following claim proves that $w(E_{add}) \geq w(E_{rem})$ and thus $w(M') = w(M)$.
Claim 9. For odd $i$, $d_{\mathbf{b}}(y,x) - d_{\mathbf{b}}(y,s_1) = d_{(-1)^{\mu_i+1}\mathbf{b}}(s_i,s_{i-1})$.

Proof. Fix $\mathbf{b} = 1$ (since it appears in all terms). By our assumption that $r = 0$, we get that
$d(y,x) - d(y,s_1) = \beta 2^b$. Suppose $i \equiv 1 \pmod 4$, that is, $\mu_i = 1$. Then, by induction we know that
$s_{i-1} \pmod{2^{b+1}} < 2^b$. Thus, $s_i = s_{i-1} + 2^b$, implying $d_{(-1)^{\mu_i+1}}(s_i,s_{i-1}) = \beta 2^b$. If $i \equiv 3 \pmod 4$,
that is, $\mu_i = 0$, we get $d_{(-1)^{\mu_i+1}}(s_i,s_{i-1}) = d(s_{i-1},s_i)$. By induction, $s_{i-1} \pmod{2^{b+1}} \geq 2^b$. So,
$d(s_{i-1},s_i) = \beta 2^b$ as well. □



To complete the contradiction, we show that $\Phi(M') > \Phi(M)$. Note that by induction, the msds
of the pairs in $E_-$ are precisely those of the pairs in $E_+ \cup (s_i,s_{i-3})$. This is because $(s_{i-3},s_i) =
(s_{i-2} \oplus v, s_{i-1} \oplus v)$, for some vector $v$, as argued above. Therefore, $\Phi(M') - \Phi(M)$ is precisely the
LHS of the claim below, and thus $> 0$.

Claim 10. $\mathrm{msd}(|s_{i+1} - s_{i-1}|) + \mathrm{msd}(|s_1 - s_{-1}|) - \mathrm{msd}(|s_{i+1} - s_i|) - \mathrm{msd}(|s_{-1} - s_0|) > 0$.



Proof. The proof is very similar to that of Claim 1. As in that proof, we get $\mathrm{msd}(|s_1 - s_{-1}|) -
\mathrm{msd}(|s_{-1} - s_0|) > 0$. All we need to show is $\mathrm{msd}(|s_{i+1} - s_{i-1}|) \geq \mathrm{msd}(|s_{i+1} - s_i|)$. By induction, we
get that $s_{i-1} \pmod{2^{b+1}} < 2^b$. Since $(s_{i-1},s_i) \in H$, we get $s_i = s_{i-1} + 2^b$ and $s_i \pmod{2^{b+1}} \geq 2^b$.
For contradiction's sake, we have assumed $s_{i+1} \pmod{2^{b+1}} < 2^b$. Thus, the largest power of 2 that
divides $|s_{i+1} - s_i|$ is $\leq b$. Since $|s_{i+1} - s_{i-1}| = |s_{i+1} - s_i + 2^b|$, we get that $\mathrm{msd}(|s_{i+1} - s_{i-1}|) \geq
\mathrm{msd}(|s_{i+1} - s_i|)$. □
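
The divisibility step in this proof can be checked by brute force on small integers. The sketch below is illustrative only and rests on a labeled assumption: it reads $\mathrm{msd}(n)$ as the exponent of the largest power of 2 dividing $n$, which is how the quantity is used in the divisibility arguments here. The names `msd` and `check_msd_step` are hypothetical, with `prev`, `cur`, `nxt` standing for $s_{i-1}$, $s_i$, $s_{i+1}$.

```python
def msd(n):
    """Exponent of the largest power of 2 dividing n > 0 (assumed reading of msd)."""
    assert n > 0
    return (n & -n).bit_length() - 1


def check_msd_step(b, limit):
    """Brute-force the inequality msd(|nxt - prev|) >= msd(|nxt - cur|).

    prev and nxt both have residue < 2^b modulo 2^(b+1); cur = prev + 2^b.
    """
    modulus = 1 << (b + 1)
    half = 1 << b
    for prev in range(limit):
        if prev % modulus >= half:
            continue
        cur = prev + half  # the H-partner of prev, so cur has residue >= 2^b
        for nxt in range(limit):
            # nxt must lie on the low side and be a vertex distinct from prev
            if nxt % modulus >= half or nxt == prev:
                continue
            assert msd(abs(nxt - cur)) <= b
            assert msd(abs(nxt - prev)) >= msd(abs(nxt - cur))
    return True
```

The first assertion inside the loop corresponds to the statement that the largest power of 2 dividing $|s_{i+1} - s_i|$ has exponent at most $b$; the second is the inequality needed for the claim.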



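This completes the proof of parts (ii), (iii), and thus the proof of Lemma 13. □

All that remains is the proof of Claim 6, which we perform next.

Proof of Claim 6: Recall we need to prove that for odd $i$,

\[
W_+(i) - W_-(i) = (-1)^{\mu_i+1}\mathbf{b}\,(f(s_{i-1}) - f(s_{i-3})) - d_{\mathbf{b}}(y,s_1) + d_{\mathbf{b}}(y,x) + d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-2},s_{i-1})
\]
We expand out the function $\sigma$ in the definitions of $W_-(i)$ and $W_+(i)$ to get longer (but similar
looking) expressions.

\[
\begin{aligned}
W_-(i) &= \mathbf{b}\,(f(y) - f(x)) - d_{\mathbf{b}}(y,x) + \sum_{\ell=0}^{\lfloor i/4\rfloor-\mu_i} \big[-\mathbf{b}\,(f(s_{4\ell+1}) - f(s_{4\ell+2})) - d_{-\mathbf{b}}(s_{4\ell+1},s_{4\ell+2})\big] \\
&\qquad + \sum_{\ell=0}^{\lfloor i/4\rfloor-1} \big[\mathbf{b}\,(f(s_{4\ell+3}) - f(s_{4\ell+4})) - d_{\mathbf{b}}(s_{4\ell+3},s_{4\ell+4})\big] \\
&= \mathbf{b}\Big[f(y) - \sum_{\ell=0}^{\lfloor i/4\rfloor} f(s_{4\ell}) - \sum_{\ell=0}^{\lfloor i/4\rfloor-\mu_i} f(s_{4\ell+1}) + \sum_{\ell=0}^{\lfloor i/4\rfloor-\mu_i} f(s_{4\ell+2}) + \sum_{\ell=0}^{\lfloor i/4\rfloor-1} f(s_{4\ell+3})\Big] \\
&\qquad - d_{\mathbf{b}}(y,x) - \sum_{\ell=0}^{\lfloor i/4\rfloor-\mu_i} d_{-\mathbf{b}}(s_{4\ell+1},s_{4\ell+2}) - \sum_{\ell=0}^{\lfloor i/4\rfloor-1} d_{\mathbf{b}}(s_{4\ell+3},s_{4\ell+4})
\end{aligned}
\]

\[
\begin{aligned}
W_+(i) &= \mathbf{b}\,(f(y) - f(s_1)) - d_{\mathbf{b}}(y,s_1) + \sum_{\ell=0}^{\lfloor i/4\rfloor-1} \big[-\mathbf{b}\,(f(s_{4\ell}) - f(s_{4\ell+3})) - d_{-\mathbf{b}}(s_{4\ell},s_{4\ell+3})\big] \\
&\qquad + \sum_{\ell=1}^{\lfloor i/4\rfloor-\mu_i} \big[\mathbf{b}\,(f(s_{4\ell-2}) - f(s_{4\ell+1})) - d_{\mathbf{b}}(s_{4\ell-2},s_{4\ell+1})\big] \\
&= \mathbf{b}\Big[f(y) - \sum_{\ell=0}^{\lfloor i/4\rfloor-1} f(s_{4\ell}) - \sum_{\ell=0}^{\lfloor i/4\rfloor-\mu_i} f(s_{4\ell+1}) + \sum_{\ell=0}^{\lfloor i/4\rfloor-\mu_i-1} f(s_{4\ell+2}) + \sum_{\ell=0}^{\lfloor i/4\rfloor-1} f(s_{4\ell+3})\Big] \\
&\qquad - d_{\mathbf{b}}(y,s_1) - \sum_{\ell=0}^{\lfloor i/4\rfloor-1} d_{-\mathbf{b}}(s_{4\ell},s_{4\ell+3}) - \sum_{\ell=1}^{\lfloor i/4\rfloor-\mu_i} d_{\mathbf{b}}(s_{4\ell-2},s_{4\ell+1})
\end{aligned}
\]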

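We use the projection property of $d$ (Claim 3), as we have done in the two parts above. Note
that this property also holds for $d_{\mathbf{b}}$ (for any marker $\mathbf{b}$). Hence, $d_{-\mathbf{b}}(s_{4\ell},s_{4\ell+3}) = d_{-\mathbf{b}}(s_{4\ell+1},s_{4\ell+2})$.
Similarly, $d_{\mathbf{b}}(s_{4\ell-2},s_{4\ell+1}) = d_{\mathbf{b}}(s_{4\ell-1},s_{4\ell})$. In the second summation of the very last line for $W_+(i)$,
we can use projection, modify indices, and replace it by $\sum_{\ell=0}^{\lfloor i/4\rfloor-\mu_i-1} d_{\mathbf{b}}(s_{4\ell+3},s_{4\ell+4})$.
We subtract these bounds and set $\hat{\imath} = \lfloor i/4\rfloor$.

\[
\begin{aligned}
W_+(i) - W_-(i) &= \mathbf{b}\,(f(s_{4\hat{\imath}}) - f(s_{4\hat{\imath}-4\mu_i+2})) - d_{\mathbf{b}}(y,s_1) + d_{\mathbf{b}}(y,x) + (1-\mu_i)\,d_{-\mathbf{b}}(s_{4\hat{\imath}+1},s_{4\hat{\imath}+2}) \\
&\qquad + \mu_i\,d_{\mathbf{b}}(s_{4\hat{\imath}-1},s_{4\hat{\imath}})
\end{aligned}
\]

If $i \equiv 1 \pmod 4$, $\mu_i = 1$ and $4\lfloor i/4\rfloor = i-1$. If $i \equiv 3 \pmod 4$, $\mu_i = 0$ and $4\lfloor i/4\rfloor = i-3$. Substitution
completes the proof. □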
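
The telescoping of the $f$-terms in Claim 6 can be checked numerically. The sketch below is a partial sanity check only, under a strong simplifying assumption that is not in the original: it sets $d \equiv 0$, so that $\sigma(u,v,\mathbf{b})$ reduces to $\mathbf{b}(f(u)-f(v))$, and therefore it verifies only the $f$-part of the identity. Indices $-1$ and $0$ encode $y$ and $x$; all function names are hypothetical.

```python
import random


def mu(i):
    """mu_i = 1 if i = 1 (mod 4), else 0 (i is assumed odd)."""
    return 1 if i % 4 == 1 else 0


def sigma(f, u, v, sign):
    """sigma(s_u, s_v, sign) under the simplifying assumption d = 0."""
    return sign * (f[u] - f[v])


def w_minus(f, i, b):
    """W_-(i) evaluated on the value table f (d = 0 assumed)."""
    q = i // 4
    total = sigma(f, -1, 0, b)
    total += sum(sigma(f, 4 * l + 1, 4 * l + 2, -b) for l in range(0, q - mu(i) + 1))
    total += sum(sigma(f, 4 * l + 3, 4 * l + 4, b) for l in range(0, q))
    return total


def w_plus(f, i, b):
    """W_+(i) evaluated on the value table f (d = 0 assumed)."""
    q = i // 4
    total = sigma(f, -1, 1, b)
    total += sum(sigma(f, 4 * l, 4 * l + 3, -b) for l in range(0, q))
    total += sum(sigma(f, 4 * l - 2, 4 * l + 1, b) for l in range(1, q - mu(i) + 1))
    return total


def claim6_f_part(f, i, b):
    """The f-part of the right-hand side of Claim 6."""
    return (-1) ** (mu(i) + 1) * b * (f[i - 1] - f[i - 3])


def check_claim6_f_part(max_i=41, seed=0):
    """Compare both sides on random integer-valued f for odd i and both signs."""
    rng = random.Random(seed)
    for i in range(3, max_i, 2):
        for b in (1, -1):
            f = {j: rng.randint(-50, 50) for j in range(-1, i + 1)}
            assert w_plus(f, i, b) - w_minus(f, i, b) == claim6_f_part(f, i, b)
    return True
```

Integer values are used so the comparison is exact. The $d$-terms of Claim 6 depend on the projection property and are not modeled here.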



Armed with this handle on the ancestor-descendant relationships, we can prove the progress and
disjointedness lemmas alluded to earlier.



Lemma 14. For odd $i$, $s_i$ is not $M$-unmatched.

Proof. Suppose $s_i$ is $M$-unmatched. We can now involve $s_i$ in a new matching. Set $E_{rem} = E_-(i)$.
Set $E_{add} = E_+(i) \cup \{(s_{i-3},s_i)\}$. We can remove $E_{rem}$ and add $E_{add}$ to get a valid matching. Note
that this is possible because $s_i$ does not participate in a pair of $M$. So $w(E_{add}) - w(E_{rem}) \leq 0$. By
Lemma 13 and Claim 6,

\[
\begin{aligned}
0 \geq w(E_{add}) - w(E_{rem}) &= w(E_+(i)) - w(E_-(i)) + w(s_{i-3},s_i) \\
&\geq W_+(i) - W_-(i) + \sigma(s_{i-3},s_i,(-1)^{\mu_i+1}\mathbf{b}) \\
&= (-1)^{\mu_i+1}\mathbf{b}\,(f(s_{i-1}) - f(s_{i-3})) + (-1)^{\mu_i+1}\mathbf{b}\,(f(s_{i-3}) - f(s_i)) \\
&\qquad - d_{\mathbf{b}}(y,s_1) + d_{\mathbf{b}}(y,x) + d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-2},s_{i-1}) - d_{(-1)^{\mu_i+1}\mathbf{b}}(s_{i-3},s_i) \\
&= (-1)^{\mu_i+1}\mathbf{b}\,(f(s_{i-1}) - f(s_i)) - d_{\mathbf{b}}(y,s_1) + d_{\mathbf{b}}(y,x)
\end{aligned}
\]
\[
\Longrightarrow\quad (-1)^{\mu_i}\mathbf{b}\,(f(s_{i-1}) - f(s_i)) \geq d_{\mathbf{b}}(y,x) - d_{\mathbf{b}}(y,s_1)
\]
The last equality follows from projection. The final conclusion implies $(s_i,s_{i-1})$ is a violating pair
by Claim 8. □



Lemma 15. For odd $i$, $s_i \notin X$.

Proof. The proof is similar to that of Lemma 10. We need to show that $(s_i,s_{i+1}) \notin M$. Suppose
$i \equiv 1 \pmod 4$ (the other case has the same argument). Lemma 13 (ii) shows that $s_{i+1} \pmod{2^{b+1}}$
is $\geq 2^b$. Furthermore, (iii) shows that $s_{i-1} \pmod{2^{b+1}} < 2^b$. Thus, $s_i \pmod{2^{b+1}} \geq 2^b$ since $s_i =
s_{i-1} + 2^b$. Now, if both these remainders are the same then $2^{b+1} \mid |s_i - s_{i+1}|$, implying $\mathrm{msd}(|s_i - s_{i+1}|) > b$,
which in turn implies $(s_i,s_{i+1}) \notin M$. If the remainders are not the same, then $|s_i - s_{i+1}| < 2^b$. This
implies that $2^b \nmid |s_i - s_{i+1}|$, which again implies $(s_i,s_{i+1}) \notin M$. □